In the BeatAML cohort we found 28 variants of interest, of which 25 variants can be validated by RNASeq data.
The samples of interest are those in which the variant was called in either DNA or RNA and for which RNASeq data are available for validation. Splice junctions are searched using the 1-based intronic positions reported by STAR-SJCounts.
Splicing alterations to be evaluated:
Variant found in 19 patients of the BeatAML (20 samples).
The splicing alterations being assessed are:
Variant information:
Load the extracted splice junctions of the gene harboring the mutation.
extractedSJ_path <- paste0(extractedSJ_dir_in,"NRAS_UM_annotSJ.tsv")
GeneSJ <- read.delim(extractedSJ_path, sep ="\t")
Set the sample’s group: mutated (MUT) or non-mutated (WT)
samples_df <- found_variants[found_variants$Gene=="NRAS" & found_variants$MutationKey_Hg38 == "chr1,114716123,C,T",]
cases <- samples_df$RNA_Sample[samples_df$Validable == "Validable"]
GeneSJ$GROUP <- ifelse(GeneSJ$sample_id %in% cases , "MUT", "WT")
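The labelling rule above can be illustrated on toy data. This is a minimal sketch: the sample IDs and the objects `toy_sj` and `toy_cases` are invented for illustration and are not part of the BeatAML tables. It also shows one way to list mutated samples that have no row in the junction table, since `%in%` drops them silently.

```r
# Toy stand-ins for GeneSJ and for the vector of validable mutated RNA samples.
# All sample IDs here are hypothetical.
toy_sj <- data.frame(sample_id = c("S1", "S2", "S3", "S4"),
                     stringsAsFactors = FALSE)
toy_cases <- c("S2", "S4", "S9")   # S9 has no row in toy_sj

# Same rule as in the report: listed samples are MUT, everything else is WT.
toy_sj$GROUP <- ifelse(toy_sj$sample_id %in% toy_cases, "MUT", "WT")

# Mutated samples that are absent from the junction table are not flagged
# by %in%, so list them explicitly.
missing_cases <- setdiff(toy_cases, toy_sj$sample_id)

print(table(toy_sj$GROUP))
print(missing_cases)
```

Comparing `length(cases)` with the number of rows labelled MUT is a quick check that every validable sample was matched.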
Search for the splice junctions of interest in the extracted splice junctions of the gene by position (chr_SJstart_SJend).
Search: chr1:114716126
Show all the splice junctions containing a position from 114716120 to 114716129
colnames(GeneSJ)[grep("11471612",colnames(GeneSJ))]
## [1] "chr1_114713979_114716126"
Found: chr1:114713979-114716126
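The column search relies on substring matching of junction names. The sketch below, on invented column names, contrasts a substring pattern, which matches every coordinate sharing those leading digits, with a pattern anchored on one exact end coordinate. The third junction name is hypothetical.

```r
# Hypothetical column names in the chr_SJstart_SJend convention.
toy_cols <- c("sample_id",
              "chr1_114713979_114716126",
              "chr1_114713979_114716657",
              "chr1_114716120_114716657")   # invented junction

# Substring search: "11471612" matches any coordinate from 114716120 to
# 114716129, whether it is the junction start or the junction end.
hits_substring <- grep("11471612", toy_cols, fixed = TRUE, value = TRUE)

# Anchored search: only junctions whose end coordinate is exactly 114716126.
hits_exact_end <- grep("_114716126$", toy_cols, value = TRUE)

print(hits_substring)
print(hits_exact_end)
```

The substring form is convenient for scanning a small window around a predicted site; the anchored form avoids picking up unrelated junctions that happen to share the same leading digits.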
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr1_114713979_114716126
## [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [38] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [75] 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [112] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [149] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [186] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [223] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [260] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [297] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [334] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [371] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [408] 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [445] 0 0 0 0 0 0 0 0 0 0 0 0 0
Samples with the SJ of interest:
table(GeneSJ$chr1_114713979_114716126>0)
##
## FALSE TRUE
## 455 2
Groups of the samples having the alternative splice junction:
table(GeneSJ$GROUP[GeneSJ$chr1_114713979_114716126 > 0])
##
## MUT WT
## 1 1
Alternative SJ found in the mutated samples.
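The two tables above can be combined into one cross-tabulation, which also gives the fraction of each group carrying the junction. This is a sketch on invented counts; `toy` and `sj_reads` are hypothetical names.

```r
# Hypothetical read counts for one alternative junction.
toy <- data.frame(
  GROUP    = c("MUT", "MUT", "WT", "WT", "WT"),
  sj_reads = c(3, 0, 0, 1, 0)
)

# Rows: group; columns: whether the junction has at least one read.
tab <- table(GROUP = toy$GROUP, has_SJ = toy$sj_reads > 0)

# Fraction of samples with the junction, computed within each group.
frac_with_sj <- prop.table(tab, margin = 1)[, "TRUE"]

print(tab)
print(round(frac_with_sj, 3))
```

Reporting the within-group fraction next to the raw counts helps when the groups differ greatly in size.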
Search: chr1:114713979-114716657
Show all the splice junctions containing the position 114713979
colnames(GeneSJ)[grep("114713979",colnames(GeneSJ))]
## [1] "chr1_114713979_114714441" "chr1_114713979_114715360"
## [3] "chr1_114713979_114715936" "chr1_114713979_114715946"
## [5] "chr1_114713979_114715953" "chr1_114713979_114716015"
## [7] "chr1_114713979_114716049" "chr1_114713979_114716076"
## [9] "chr1_114713979_114716126" "chr1_114713979_114716152"
## [11] "chr1_114713979_114716657"
Found: chr1:114713979-114716657
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr1_114713979_114716657
## [1] 0 14 0 4 7 0 0 1 3 0 0 0 8 31 0 15 0 0
## [19] 12 0 48 1 4 15 68 7 7 1 10 2 1 3 4 0 1 8
## [37] 1 6 4 42 0 0 3 1 3 21 0 5 0 2 1 33 3 0
## [55] 3 5 7 28 5 0 4 28 0 16 2 9 0 5 0 0 0 10
## [73] 7 2 9 0 4 1 50 0 1 0 110 7 9 0 8 0 5 2
## [91] 5 4 17 26 55 1 0 1 0 0 5 0 1 12 0 0 6 11
## [109] 0 0 1 0 4 0 19 1 1 0 2 6 3 19 0 2 0 20
## [127] 21 0 3 4 13 0 1 4 8 0 0 8 4 4 3 3 16 33
## [145] 0 67 3 138 0 1 0 19 3 0 0 2 3 14 3 5 0 14
## [163] 0 4 0 0 28 2 0 2 33 4 15 12 1 11 2 0 0 1
## [181] 0 0 0 2 12 0 0 0 0 1 3 15 100 2 0 8 0 0
## [199] 1 1 2 45 5 23 3 0 2 4 2 6 4 8 9 1 44 5
## [217] 38 0 4 6 8 14 3 1 1 147 2 4 0 1 0 10 12 7
## [235] 4 0 5 0 0 2 0 34 5 0 8 74 6 0 7 3 3 14
## [253] 0 0 0 2 43 1 6 0 0 0 3 0 1 45 1 3 4 0
## [271] 27 5 2 0 4 0 0 0 1 1 3 1 8 59 2 4 33 5
## [289] 4 4 1 8 92 12 0 1 7 0 2 1 2 2 8 2 2 0
## [307] 5 4 1 7 0 0 3 2 2 0 6 1 7 11 2 0 56 1
## [325] 5 0 0 2 8 3 8 13 17 3 18 4 0 0 6 1 2 6
## [343] 3 4 0 0 7 1 3 2 79 10 4 0 10 8 46 3 16 10
## [361] 2 6 0 1 34 1 0 10 63 1 1 3 2 8 10 5 1 7
## [379] 0 1 0 6 3 6 4 0 4 1 0 4 5 0 1 0 2 0
## [397] 4 0 0 0 0 5 0 0 2 3 1 5 8 0 0 9 0 5
## [415] 2 5 5 1 0 16 13 0 10 8 0 7 48 1 2 0 0 0
## [433] 0 4 12 6 1 9 0 7 6 2 0 7 6 0 0 12 1 0
## [451] 1 2 2 0 14 1 0
Samples with the SJ of interest:
table(GeneSJ$chr1_114713979_114716657>0)
##
## FALSE TRUE
## 135 322
Groups of the samples having the alternative splice junction:
table(GeneSJ$GROUP[GeneSJ$chr1_114713979_114716657 > 0])
##
## MUT WT
## 12 310
Alternative SJ found in the mutated samples.
Search: chr1:114713979-114716076
Show all the splice junctions containing the positions from 114716070 to 114716079
colnames(GeneSJ)[grep("11471607",colnames(GeneSJ))]
## [1] "chr1_114713979_114716076"
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr1_114713979_114716076
## [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [38] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [75] 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [112] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [149] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [186] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [223] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [260] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [297] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [334] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [371] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [408] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [445] 0 0 0 0 0 0 0 0 0 0 0 0 0
Samples with the SJ of interest:
table(GeneSJ$chr1_114713979_114716076 >0)
##
## FALSE TRUE
## 456 1
Groups of the samples having the alternative splice junction:
table(GeneSJ$GROUP[GeneSJ$chr1_114713979_114716076 > 0])
##
## WT
## 1
Alternative SJ not found in the mutated samples of the splice junction collection.
Exon1-2: chr1:114716178-114716657
Exon2-3: chr1:114713979-114716049
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr1_114716178_114716657
## [1] 60 249 71 7 68 75 108 101 69 49 99 74 182 219 111 94 96 95
## [19] 172 78 133 88 65 441 138 179 64 6 95 116 92 135 83 69 47 186
## [37] 74 142 131 182 112 76 157 118 80 222 7 111 58 123 146 99 67 66
## [55] 188 115 83 271 66 137 109 58 51 139 82 128 75 79 90 74 115 95
## [73] 165 107 140 73 235 150 133 110 68 46 131 85 125 52 128 140 90 75
## [91] 109 152 180 162 89 81 119 133 64 120 80 61 110 87 104 165 178 120
## [109] 36 55 158 59 111 98 104 125 107 113 53 170 71 179 121 91 136 165
## [127] 194 64 137 115 87 56 71 99 94 71 74 180 90 92 125 110 225 83
## [145] 95 80 31 163 120 92 83 531 132 96 111 121 153 116 77 94 85 294
## [163] 47 84 98 61 107 62 58 162 160 93 143 216 49 232 150 81 84 327
## [181] 102 204 78 108 133 148 62 42 113 41 54 67 111 62 84 95 96 109
## [199] 79 66 100 119 143 255 56 119 122 127 100 139 87 194 69 91 241 155
## [217] 139 93 96 251 92 228 70 241 70 263 79 112 88 61 103 180 134 26
## [235] 70 57 68 119 72 69 90 129 87 60 60 170 185 2 77 97 118 115
## [253] 70 86 97 134 92 159 138 66 95 75 100 101 54 246 111 92 101 120
## [271] 427 110 123 67 78 56 37 83 128 69 107 112 136 172 67 148 95 103
## [289] 29 112 85 93 275 145 121 118 57 134 129 28 155 211 107 60 69 83
## [307] 85 142 79 236 85 21 113 62 181 72 206 52 171 114 53 66 98 99
## [325] 70 82 37 65 96 22 105 132 229 80 155 107 109 117 96 113 90 132
## [343] 41 138 102 67 145 80 116 118 102 98 239 62 136 114 153 210 111 99
## [361] 87 90 126 87 123 78 87 126 79 147 120 103 90 128 132 70 78 139
## [379] 125 189 0 73 136 168 74 93 133 123 117 89 126 100 69 19 136 124
## [397] 90 53 104 101 115 55 107 70 127 143 68 167 137 35 149 93 57 146
## [415] 164 197 103 40 59 135 69 107 163 250 64 154 189 58 83 128 62 46
## [433] 0 72 54 306 106 140 139 152 128 225 180 90 190 54 24 127 130 107
## [451] 56 126 109 162 143 48 122
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr1_114713979_114716049
## [1] 98 156 122 0 124 195 223 216 62 127 117 109 158 475 173 153 91 204
## [19] 175 83 174 194 58 449 133 195 86 2 94 137 228 148 76 76 96 149
## [37] 73 309 108 287 339 85 184 112 89 292 51 67 141 103 130 133 56 136
## [55] 294 264 72 298 69 95 168 96 58 245 91 143 212 119 80 64 185 77
## [73] 137 150 146 118 414 300 142 168 177 69 222 116 142 137 238 158 113 100
## [91] 309 308 328 229 139 204 132 104 80 269 137 154 113 70 163 189 195 79
## [109] 89 81 227 107 152 179 69 184 316 125 93 114 120 240 109 169 288 209
## [127] 176 84 220 89 122 76 67 277 93 65 87 213 109 70 99 124 355 79
## [145] 87 162 46 202 210 108 121 510 219 116 175 174 203 159 82 183 97 302
## [163] 42 142 208 70 187 72 52 127 154 122 285 145 95 205 285 155 64 284
## [181] 170 162 176 105 245 217 114 84 103 77 129 163 189 48 134 111 85 115
## [199] 162 68 99 100 303 289 74 106 258 180 163 144 164 219 51 179 248 154
## [217] 105 136 76 233 73 239 117 264 129 251 96 98 99 117 100 321 200 38
## [235] 161 123 70 206 97 132 131 127 142 181 98 300 197 1 106 120 112 137
## [253] 88 149 92 152 96 165 194 49 101 41 81 165 85 166 186 117 85 133
## [271] 352 93 118 161 97 78 105 72 190 121 165 97 126 287 78 141 73 195
## [289] 71 94 123 378 216 118 95 143 184 124 120 163 177 279 164 102 106 142
## [307] 77 117 64 125 184 35 188 51 197 65 146 99 215 162 91 130 122 94
## [325] 116 160 149 90 177 48 123 113 341 100 197 136 174 156 135 87 79 182
## [343] 177 208 77 156 137 159 234 94 128 209 275 143 119 142 146 167 154 151
## [361] 224 140 254 93 94 85 97 160 212 170 125 205 108 103 280 134 155 247
## [379] 194 129 0 63 83 177 148 177 192 113 135 42 160 197 131 7 96 86
## [397] 75 85 101 167 116 132 90 72 135 274 46 292 286 72 149 80 52 86
## [415] 124 214 231 106 135 247 134 132 280 179 100 257 138 151 102 188 132 59
## [433] 2 133 110 285 119 172 124 164 110 160 206 102 213 60 21 113 143 247
## [451] 118 167 214 129 265 118 269
Count the reads of all the splice junctions of the gene harboring the variant:
GeneSJ$rowSum_SJtotal <- rowSums(GeneSJ[,grep("chr", names(GeneSJ))])
Normalization of the expression by the total read counts of all the splice junctions of the gene:
GeneSJ$Normalized_CanonEx1_2 <- (GeneSJ$chr1_114716178_114716657)/GeneSJ$rowSum_SJtotal*100
GeneSJ$Normalized_CanonEx2_3 <- (GeneSJ$chr1_114713979_114716049)/GeneSJ$rowSum_SJtotal*100
GeneSJ$Normalized_DG <- (GeneSJ$chr1_114713979_114716126)/GeneSJ$rowSum_SJtotal*100
GeneSJ$Normalized_ES2 <- (GeneSJ$chr1_114713979_114716657)/GeneSJ$rowSum_SJtotal*100
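The normalization expresses each junction as a percentage of all junction reads of the gene in that sample. A minimal sketch on invented counts follows; the column names are hypothetical, and anchoring the pattern as `^chr` is an assumption made here so that only columns whose names start with `chr` enter the total.

```r
# Hypothetical junction counts for three samples.
toy <- data.frame(
  sample_id    = c("S1", "S2", "S3"),
  chr1_100_200 = c(90, 50, 0),   # invented canonical junction
  chr1_100_300 = c(10, 50, 0)    # invented alternative junction
)

# Select the junction columns by name, then sum them per sample.
sj_cols <- grep("^chr", names(toy), value = TRUE)
toy$rowSum_SJtotal <- rowSums(toy[, sj_cols, drop = FALSE])

# Percentage of the gene's junction reads supporting the alternative junction.
toy$Normalized_alt <- toy$chr1_100_300 / toy$rowSum_SJtotal * 100

print(toy)
```

A sample with no junction reads at all yields 0/0, which R reports as `NaN`; this is one reason to pass `na.rm = TRUE` when averaging the normalized values.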
Download the normalized values for the assessed splice junctions of all the AML samples:
Mutated samples vaf:
Canonical SJ:
Splicing alterations:
Canonical SJ:
Splicing alteration:
Violin Plots for the alternative splice junctions interrogated:
SJCounts <- GeneSJ #BeatAML.NRAS.chr1-114716123-C-T.xlsx
Normality Test:
shapiro.test(SJCounts$Normalized_DG[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_DG[SJCounts$GROUP == "WT"]
## W = 0.023436, p-value < 2.2e-16
Value of Mean Normalized Expression of the Alternative SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_DG[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 0.0001201865
Normalized Expression Values of the Alternative SJ in the MUT samples:
MUT_SJi <- SJCounts$Normalized_DG[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 0.00000000 0.00000000 0.00000000 0.05428882 0.00000000 0.00000000
## [7] 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
## [13] 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_DG - mean_WT_SJi
Difference in the MUT samples: deviation of the normalized expression of each MUT sample from the mean normalized WT expression
SJCounts[SJCounts$GROUP == "MUT", c("sample_id", "Difference")]
## sample_id Difference
## 42 BA2093R -0.0001201865
## 47 BA2098R -0.0001201865
## 49 BA2101R -0.0001201865
## 87 BA2218R 0.0541686300
## 115 BA2276R -0.0001201865
## 126 BA2301R -0.0001201865
## 191 BA2470R -0.0001201865
## 219 BA2523R -0.0001201865
## 236 BA2564R -0.0001201865
## 286 BA2691R -0.0001201865
## 307 BA2731R -0.0001201865
## 315 BA2748R -0.0001201865
## 345 BA2822R -0.0001201865
## 361 BA2851R -0.0001201865
## 383 BA2901R -0.0001201865
## 393 BA2914R -0.0001201865
## 414 BA2956R -0.0001201865
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:2] = -0.00012019, 0.052762
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.9977273 0.9977273 0.9977273 1.0000000 0.9977273 0.9977273 0.9977273
## [8] 0.9977273 0.9977273 0.9977273 0.9977273 0.9977273 0.9977273 0.9977273
## [15] 0.9977273 0.9977273 0.9977273
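The percentile step can be sketched on toy numbers. `ecdf()` returns the fraction of WT values less than or equal to the query, so tied zeros in the WT group give a zero-valued MUT sample a high percentile. The upper-tail proportion computed at the end is an assumption about how an empirical p-value for a gain could be derived; the report itself leaves `Pvalue` as `NA`. All values are invented.

```r
# Hypothetical normalized expression values.
wt_values  <- c(0, 0, 0, 0, 0.2, 0.5, 1.0, 1.5, 2.0, 4.0)
mut_values <- c(0, 1.2, 5.0)

# Percentile of each MUT value within the WT distribution.
wt_ecdf    <- ecdf(wt_values)
percentile <- wt_ecdf(mut_values)

# Subtracting the WT mean from both groups shifts every value equally,
# so the percentiles are unchanged.
m <- mean(wt_values)
percentile_shifted <- ecdf(wt_values - m)(mut_values - m)

# Assumed upper-tail empirical p-value: fraction of WT values at least as
# large as each MUT value.
upper_tail <- vapply(mut_values, function(v) mean(wt_values >= v), numeric(1))

print(percentile)
print(upper_tail)
```

With many zeros in the WT group, a MUT value of zero is uninformative, which is consistent with masking those percentiles as `NA`.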
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_DG")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$Prediction <- "Donor Gain"
MUT_df$splice_junction_status <- "AlternativeSJ found in MUT samples"
MUT_df$splice_junction_position <- "chr1:114713979-114716126"
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
MUT_df$Pvalue <- NA
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
Download the VAF, inferred percentiles, and p-values of the mutated samples:
Normality Test:
shapiro.test(SJCounts$Normalized_ES2[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_ES2[SJCounts$GROUP == "WT"]
## W = 0.56704, p-value < 2.2e-16
Value of Mean Normalized Expression of the Alternative SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_ES2[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 0.4989328
Normalized Expression Values of the Alternative SJ in the MUT samples:
MUT_SJi <- SJCounts$Normalized_ES2[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 0.0000000 0.0000000 0.0000000 0.4343105 2.3058252 0.9638554 0.2811621
## [8] 0.4629630 0.0000000 0.3218021 0.5347594 0.1421464 0.0000000 0.1232286
## [15] 0.3064351 0.1039501 0.4566210
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_ES2 - mean_WT_SJi
Difference in the MUT samples: deviation of the normalized expression of each MUT sample from the mean normalized WT expression
SJCounts[SJCounts$GROUP == "MUT", c("sample_id", "Difference")]
## sample_id Difference
## 42 BA2093R -0.49893283
## 47 BA2098R -0.49893283
## 49 BA2101R -0.49893283
## 87 BA2218R -0.06462230
## 115 BA2276R 1.80689241
## 126 BA2301R 0.46492259
## 191 BA2470R -0.21777070
## 219 BA2523R -0.03596987
## 236 BA2564R -0.49893283
## 286 BA2691R -0.17713074
## 307 BA2731R 0.03582652
## 315 BA2748R -0.35678642
## 345 BA2822R -0.49893283
## 361 BA2851R -0.37570425
## 383 BA2901R -0.19249770
## 393 BA2914R -0.39498273
## 414 BA2956R -0.04231183
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:302] = -0.49893, -0.45996, -0.45575, ..., 4.702, 5.2631
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.2954545 0.2954545 0.2954545 0.6909091 0.9409091 0.8704545 0.5818182
## [8] 0.7068182 0.2954545 0.6159091 0.7545455 0.4340909 0.2954545 0.4000000
## [15] 0.6000000 0.3795455 0.7022727
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_ES2")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
MUT_df$Pvalue <- NA
MUT_df$Prediction <- "Exon Skipping"
MUT_df$splice_junction_status <- "AlternativeSJ found in MUT samples"
MUT_df$splice_junction_position <- "chr1:114713979-114716657"
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
Download the VAF, inferred percentiles, and p-values of the mutated samples:
Normality:
shapiro.test(SJCounts$Normalized_ES2[SJCounts$GROUP== "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_ES2[SJCounts$GROUP == "WT"]
## W = 0.56704, p-value < 2.2e-16
shapiro.test(SJCounts$Normalized_ES2[SJCounts$GROUP== "MUT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_ES2[SJCounts$GROUP == "MUT"]
## W = 0.65546, p-value = 3.774e-05
Mann-Whitney:
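wt <- wilcox.test(x = SJCounts$Normalized_ES2[SJCounts$GROUP == "MUT"],
                  y = SJCounts$Normalized_ES2[SJCounts$GROUP == "WT"],
                  alternative = "two.sided",
                  paired = FALSE,
                  conf.int = TRUE,     # logical switch: request the confidence interval
                  conf.level = 0.95)   # coverage of that interval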
wt
##
## Wilcoxon rank sum test with continuity correction
##
## data: SJCounts$Normalized_ES2[SJCounts$GROUP == "MUT"] and SJCounts$Normalized_ES2[SJCounts$GROUP == "WT"]
## W = 3703, p-value = 0.9448
## alternative hypothesis: true location shift is not equal to 0
## 95 percent confidence interval:
## -0.1358253 0.1232654
## sample estimates:
## difference in location
## -1.195008e-05
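For reference, a self-contained sketch of the same comparison on invented values. In `wilcox.test`, `conf.int` is a logical switch and `conf.level` sets the coverage. Tied values, which are common with zero-inflated junction counts, trigger warnings about exact p-values and intervals; they are suppressed here only to keep the output short.

```r
# Hypothetical normalized expression values for the two groups.
mut <- c(0, 0, 0.1, 0.3, 0.4, 0.5, 2.3)
wt  <- c(0, 0, 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 1.0, 4.7)

# Two-sided Mann-Whitney (Wilcoxon rank-sum) test with a 95% interval
# for the location shift.
res <- suppressWarnings(
  wilcox.test(x = mut, y = wt,
              alternative = "two.sided",
              paired = FALSE,
              conf.int = TRUE,
              conf.level = 0.95)
)

# Individual fields of the result object.
print(res$p.value)
print(res$estimate)   # estimated difference in location
print(res$conf.int)
```

Extracting `p.value`, `estimate`, and `conf.int` from the result makes it easy to collect the test outcome for several junctions in one table.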
Variant found in 8 patients of the BeatAML (8 samples)
The splicing alterations being assessed are:
Variant information:
Load the extracted splice junctions of the gene harboring the mutation.
extractedSJ_path <- paste0(extractedSJ_dir_in,"KRAS_UM_annotSJ.tsv")
GeneSJ <- read.delim(extractedSJ_path, sep ="\t")
Set the sample’s group: mutated (MUT) or non-mutated (WT)
samples_df <- found_variants[found_variants$Gene=="KRAS" & found_variants$MutationKey_Hg38 == "chr12,25245347,C,T",]
cases <- samples_df$RNA_Sample[samples_df$Validable == "Validable"]
GeneSJ$GROUP <- ifelse(GeneSJ$sample_id %in% cases , "MUT", "WT")
Search for the splice junctions of interest in the extracted splice junctions of the gene by position (chr_SJstart_SJend).
Search: predicted at 1bp from the variant 25245347, chr12:25245345
Show all the splice junctions containing the positions from 25245340 to 25245349
colnames(GeneSJ)[grepl("2524534", colnames(GeneSJ))]
## character(0)
Alternative SJ not found in the splice junction collection.
Search: chr12:25245281-25245348
Show all the splice junctions containing the positions from 25245340 to 25245349
colnames(GeneSJ)[grepl("2524534", colnames(GeneSJ))]
## character(0)
Show all the splice junctions containing the positions from 25245280 to 25245289
colnames(GeneSJ)[grepl("2524528", colnames(GeneSJ))]
## character(0)
Alternative SJ not found in the splice junction collection.
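A substring pattern such as `"2524534"` covers one block of ten positions and cannot reach across a block boundary. As an alternative sketch, the junction names can be parsed into coordinates and searched within a numeric window. The helpers `parse_sj` and `find_sj_near` are hypothetical, and the junction names below are invented for illustration.

```r
# Parse column names of the form chr_SJstart_SJend into a data frame.
parse_sj <- function(cols) {
  sj    <- grep("^chr[^_]+_[0-9]+_[0-9]+$", cols, value = TRUE)
  parts <- strsplit(sj, "_", fixed = TRUE)
  data.frame(
    name  = sj,
    chr   = vapply(parts, `[`, character(1), 1),
    start = as.numeric(vapply(parts, `[`, character(1), 2)),
    end   = as.numeric(vapply(parts, `[`, character(1), 3)),
    stringsAsFactors = FALSE
  )
}

# Return junctions with a start or end within `window` bp of `pos`.
find_sj_near <- function(cols, chr, pos, window = 5) {
  sj   <- parse_sj(cols)
  keep <- sj$chr == chr &
    (abs(sj$start - pos) <= window | abs(sj$end - pos) <= window)
  sj$name[keep]
}

# Invented column names; non-junction columns are ignored by parse_sj().
toy_cols <- c("sample_id", "GROUP",
              "chr12_25245395_25250750",
              "chr12_25227413_25245273")

none_found <- find_sj_near(toy_cols, "chr12", 25245347, window = 5)
one_found  <- find_sj_near(toy_cols, "chr12", 25245270, window = 5)

print(none_found)
print(one_found)
```

An empty result plays the same role as `character(0)` in the searches above: no junction in the collection has an end near the queried position.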
Mutated samples vaf:
Variant found in 4 patients of the BeatAML (4 samples)
The splicing alterations being assessed are:
Variant information:
Load the extracted splice junctions of the gene harboring the mutation.
extractedSJ_path <- paste0(extractedSJ_dir_in,"KMT2D_UM_annotSJ.tsv")
GeneSJ <- read.delim(extractedSJ_path, sep ="\t")
Set the sample’s group: mutated (MUT) or non-mutated (WT)
samples_df <- found_variants[found_variants$Gene=="KMT2D" & found_variants$MutationKey_Hg38 == "chr12,49022063,G,A",]
cases <- samples_df$RNA_Sample[samples_df$Validable == "Validable"]
GeneSJ$GROUP <- ifelse(GeneSJ$sample_id %in% cases , "MUT", "WT")
Search for the splice junctions of interest in the extracted splice junctions of the gene by position (chr_SJstart_SJend).
Search: chr12:49022064
Show all the splice junctions containing the positions from 49022060 to 49022069
colnames(GeneSJ)[grepl("4902206", colnames(GeneSJ))]
## character(0)
Alternative SJ not found in the splice junction collection.
Search: chr12:49022070
Show all the splice junctions containing the positions from 49022070 to 49022079
colnames(GeneSJ)[grepl("4902207", colnames(GeneSJ))]
## character(0)
Alternative SJ not found in the splice junction collection.
Mutated samples vaf:
Variant found in 3 patients of the BeatAML (3 samples)
The splicing alterations being assessed are:
Variant information:
Load the extracted splice junctions of the gene harboring the mutation.
extractedSJ_path <- paste0(extractedSJ_dir_in,"FLT3_UM_annotSJ.tsv")
GeneSJ <- read.delim(extractedSJ_path, sep ="\t")
Set the sample’s group: mutated (MUT) or non-mutated (WT)
samples_df <- found_variants[found_variants$Gene=="FLT3" & found_variants$MutationKey_Hg38 == "chr13,28018485,G,T",]
cases <- samples_df$RNA_Sample[samples_df$Validable == "Validable"]
GeneSJ$GROUP <- ifelse(GeneSJ$sample_id %in% cases , "MUT", "WT")
Search for the splice junctions of interest in the extracted splice junctions of the gene by position (chr_SJstart_SJend).
Search: chr13:28015702-28023349
Show the splice junction spanning positions 28015702-28023349
colnames(GeneSJ)[grep("28015702_28023349",colnames(GeneSJ))]
## [1] "chr13_28015702_28023349"
Found: chr13:28015702-28023349
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr13_28015702_28023349
## [1] 25 16 4 6 8 0 20 37 18 13 15 69 18 8 62 19 69 31
## [19] 4 9 14 23 21 13 71 10 8 1 0 32 61 12 34 8 1 13
## [37] 14 12 36 1 51 14 162 19 8 31 6 4 10 40 130 8 8 26
## [55] 2 26 1 22 22 3 0 0 4 13 12 0 26 7 25 34 7 19
## [73] 2 33 1 33 0 25 5 48 20 10 47 22 17 16 4 2 2 3
## [91] 86 64 5 1 51 30 22 42 16 9 36 25 24 17 59 9 45 10
## [109] 0 0 14 23 15 3 6 57 28 21 0 35 8 2 27 12 41 26
## [127] 5 2 32 29 13 15 9 8 11 4 9 2 15 0 14 40 6 26
## [145] 12 9 11 52 28 33 3 23 34 23 25 34 52 14 6 12 16 60
## [163] 2 30 29 0 15 10 21 13 9 0 19 12 8 1 36 20 27 3
## [181] 4 71 22 45 26 13 2 67 23 0 14 3 7 11 37 3 8 7
## [199] 6 31 23 7 15 1 13 13 17 44 167 28 20 8 0 58 16 9
## [217] 11 17 15 23 0 29 7 11 33 44 5 43 6 3 8 16 11 88
## [235] 39 5 0 8 7 4 0 117 30 18 15 21 23 1 32 18 0 17
## [253] 17 9 5 39 3 10 15 32 23 6 4 10 16 7 16 33 14 0
## [271] 13 15 5 9 8 17 16 36 24 28 16 10 70 8 3 10 12 7
## [289] 6 18 10 0 124 4 14 27 0 14 1 6 10 5 30 20 21 1
## [307] 18 5 5 94 30 3 61 6 9 19 9 0 10 21 1 7 13 13
## [325] 17 35 0 5 3 7 53 21 8 4 24 19 30 18 25 24 25 36
## [343] 15 83 18 17 22 12 6 23 37 19 8 0 2 2 45 12 59 30
## [361] 20 0 0 7 7 5 23 15 18 10 119 33 5 0 11 3 1 0
## [379] 55 10 0 14 8 20 1 8 6 99 21 2 5 28 7 2 14 15
## [397] 19 12 37 12 37 0 5 2 95 21 21 2 46 4 33 12 7 6
## [415] 30 4 11 2 17 3 45 51 7 21 11 13 95 23 73 12 22 10
## [433] 0 15 13 14 17 7 5 0 8 2 10 1 17 9 0 9 35 12
## [451] 5 35 39 20 3 2 61
Samples with the SJ of interest:
table(GeneSJ$chr13_28015702_28023349>0)
##
## FALSE TRUE
## 33 424
Groups of the samples having the alternative splice junction:
table(GeneSJ$GROUP[GeneSJ$chr13_28015702_28023349 > 0])
##
## MUT WT
## 1 423
Alternative SJ found in the mutated samples.
Search: chr13:28018590-28024860
Show all the splice junctions containing the positions 28018590-28024860
colnames(GeneSJ)[grep("28018590_28024860",colnames(GeneSJ))]
## [1] "chr13_28018590_28024860"
Found: chr13:28018590-28024860
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr13_28018590_28024860
## [1] 3 4 2 4 13 0 4 7 0 2 0 3 7 3 1 1 8 3 9 5 5 4 7 2 27
## [26] 1 4 1 0 2 2 17 2 0 0 14 3 8 7 4 2 4 4 2 7 23 1 15 2 0
## [51] 14 0 1 4 4 2 1 3 14 0 6 0 0 25 0 1 4 2 2 2 0 9 3 2 0
## [76] 2 4 6 9 13 2 0 78 4 30 1 3 0 0 0 1 53 5 1 13 4 4 3 0 3
## [101] 15 3 5 3 4 1 24 0 0 0 0 1 24 0 0 9 4 1 0 2 16 0 2 0 3
## [126] 1 22 1 6 15 0 4 0 5 0 0 2 5 0 0 1 3 21 2 3 0 0 11 4 6
## [151] 0 21 91 0 6 8 3 6 0 2 2 9 1 14 6 0 23 5 2 2 0 0 6 1 1
## [176] 2 1 5 3 13 3 5 2 6 5 8 1 11 0 0 0 2 2 0 7 2 1 0 1 2
## [201] 1 5 20 0 0 11 1 4 3 6 9 4 1 2 22 3 0 2 2 8 1 25 6 11 1
## [226] 13 0 10 0 0 1 15 12 3 4 1 0 4 1 1 2 5 4 2 2 7 21 0 1 36
## [251] 2 26 3 0 3 1 0 0 6 6 2 4 0 2 1 2 3 6 22 0 9 7 1 7 1
## [276] 0 2 1 18 6 2 4 1 10 0 0 4 8 0 3 5 10 3 1 3 6 0 2 0 4
## [301] 0 23 3 0 5 0 4 5 76 8 3 0 3 0 0 1 8 2 11 10 0 1 14 17 5
## [326] 6 0 0 0 0 3 4 4 1 6 2 5 1 11 0 22 7 9 16 5 0 2 3 4 0
## [351] 8 6 8 0 11 0 6 4 7 2 7 1 0 7 2 0 7 5 2 1 8 3 0 1 3
## [376] 0 0 7 3 1 0 10 4 30 3 0 16 1 2 0 23 10 2 2 2 1 0 7 1 0
## [401] 5 0 0 2 3 4 2 1 19 0 2 16 0 11 3 7 4 0 2 6 5 7 18 15 4
## [426] 6 27 1 5 3 7 1 0 7 2 6 6 10 1 1 7 15 2 0 8 0 0 17 5 1
## [451] 2 8 4 3 2 1 0
Samples with the SJ of interest:
table(GeneSJ$chr13_28018590_28024860>0)
##
## FALSE TRUE
## 102 355
Groups of the samples having the alternative splice junction:
table(GeneSJ$GROUP[GeneSJ$chr13_28018590_28024860 > 0])
##
## MUT WT
## 1 354
Alternative SJ found in the mutated samples.
Search: chr13:28015702-28024860
Show all the splice junctions containing the positions 28015702-28024860
colnames(GeneSJ)[grep("28015702_28024860",colnames(GeneSJ))]
## [1] "chr13_28015702_28024860"
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr13_28015702_28024860
## [1] 0 7 0 6 3 0 0 2 0 2 0 0 3 1 0 0 1 0 3 0 6 0 4 3 46
## [26] 4 0 1 0 1 0 7 0 0 0 3 2 0 4 0 0 1 1 0 6 9 5 19 0 0
## [51] 19 0 1 0 3 0 0 3 3 0 0 0 0 12 0 0 0 1 0 1 1 6 1 2 0
## [76] 0 0 0 3 0 0 1 58 1 6 0 1 0 0 0 1 6 1 1 4 1 0 1 0 0
## [101] 5 0 1 4 0 0 16 0 0 0 0 2 11 0 1 0 0 2 0 0 4 0 1 0 1
## [126] 9 6 0 3 5 0 0 0 3 0 0 0 1 0 0 5 0 3 4 0 0 0 6 1 2
## [151] 0 19 36 0 4 2 2 1 0 2 0 7 0 7 1 0 16 3 0 3 3 1 2 5 1
## [176] 5 1 1 3 0 0 2 0 13 6 3 0 7 0 4 0 0 0 2 0 2 0 0 0 0
## [201] 0 3 32 0 1 3 0 1 7 3 6 5 0 0 17 6 0 0 4 8 0 7 0 11 0
## [226] 15 0 7 1 0 0 17 10 4 5 1 0 5 0 0 0 7 2 0 2 2 14 0 1 6
## [251] 0 2 0 0 0 0 1 0 0 0 0 1 0 0 0 2 0 3 15 0 2 5 0 0 0
## [276] 0 0 0 16 0 0 1 5 8 5 0 4 8 0 2 1 0 2 3 1 0 0 0 0 0
## [301] 0 6 4 0 0 0 1 3 47 21 0 0 0 1 0 0 4 0 5 2 0 0 5 6 0
## [326] 0 0 1 4 2 2 3 5 1 5 0 0 0 5 1 30 7 0 4 0 0 7 0 1 0
## [351] 15 0 7 0 2 0 4 5 3 0 0 0 0 9 0 0 2 7 2 0 25 8 0 2 4
## [376] 0 0 5 0 0 0 2 1 13 5 0 10 2 0 0 11 0 0 2 3 0 0 0 0 1
## [401] 0 0 1 0 6 0 2 0 15 0 0 15 0 8 1 2 1 0 0 0 8 0 7 7 0
## [426] 1 50 1 4 0 0 1 0 0 0 5 1 6 0 0 5 3 0 0 2 0 0 6 0 0
## [451] 0 0 0 2 2 0 0
Samples with the SJ of interest:
table(GeneSJ$chr13_28015702_28024860 >0)
##
## FALSE TRUE
## 224 233
Groups of the samples having the alternative splice junction:
table(GeneSJ$GROUP[GeneSJ$chr13_28015702_28024860 > 0])
##
## MUT WT
## 1 232
Alternative SJ found in the mutated samples.
Exon 19-20 chr13:28018590-28023349
Exon 20-21 chr13:28015702-28018466; splice site chr13:28018466
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr13_28018590_28023349
## [1] 748 302 315 13 308 11 1210 1229 515 322 1004 1831 552 236 2249
## [16] 391 1778 1554 188 983 209 1002 400 170 1468 237 504 74 2 1069
## [31] 3007 530 1178 265 12 601 637 417 976 43 1461 805 4251 743 367
## [46] 1939 743 77 212 745 1789 92 361 892 57 745 12 152 1741 183
## [61] 415 10 170 617 566 52 1533 46 701 867 406 847 151 1760 11
## [76] 1111 77 1330 142 1288 1062 693 1273 929 778 481 116 55 21 114
## [91] 1107 2697 43 47 1167 2015 1349 805 1035 836 2130 1040 1022 495 1614
## [106] 267 1061 465 105 11 500 1220 792 331 103 4216 1837 731 20 1017
## [121] 327 11 722 490 760 546 722 62 1790 768 280 1958 685 521 421
## [136] 185 374 87 785 20 304 1758 163 413 588 64 353 1082 889 710
## [151] 195 932 2657 1034 1274 1807 3147 621 532 234 927 1354 91 857 1132
## [166] 61 1048 604 2019 814 280 42 381 245 205 171 1883 1840 748 113
## [181] 420 1974 586 1124 427 321 91 2041 851 12 119 57 81 704 2238
## [196] 83 493 415 905 1041 800 59 297 3 324 240 972 1496 1595 928
## [211] 942 425 24 820 384 96 337 1391 918 414 11 707 679 288 1247
## [226] 850 101 2143 398 110 256 1118 356 1431 906 397 3 329 228 159
## [241] 231 2073 1121 918 708 342 481 120 1324 702 4 1110 797 706 425
## [256] 973 136 645 740 2412 1010 564 282 519 749 57 766 1000 407 3
## [271] 265 331 148 768 345 508 756 1405 359 1512 1055 369 1264 213 92
## [286] 360 480 178 252 332 1066 356 3096 165 563 1439 21 429 18 1032
## [301] 240 157 1159 777 1204 256 312 254 337 3073 1114 263 698 898 260
## [316] 868 361 78 445 1265 23 365 622 660 675 1180 132 323 198 222
## [331] 2166 622 312 82 894 566 1948 760 1076 874 741 1117 2441 4206 626
## [346] 785 362 1123 299 659 728 1091 286 3 90 33 860 230 2750 889
## [361] 1636 72 12 53 114 186 1028 397 290 429 6005 1146 137 8 108
## [376] 34 225 215 1918 611 0 818 212 522 33 657 408 1743 1460 206
## [391] 436 2691 196 50 560 686 498 741 1051 468 1921 12 503 253 1820
## [406] 1293 686 73 1496 502 2013 233 490 243 1777 192 560 331 683 154
## [421] 1384 1549 341 388 644 573 1174 1417 1674 1419 1142 795 22 654 466
## [436] 307 632 330 219 28 162 112 362 97 499 158 18 344 1260 804
## [451] 264 1254 1083 964 227 60 2238
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr13_28015702_28018466
## [1] 648 324 222 48 270 14 907 1020 534 254 779 1725 565 209 2251
## [16] 350 2036 1250 224 970 148 843 402 206 1714 203 407 69 3 1171
## [31] 2450 504 1164 247 16 591 646 388 1007 42 1080 735 4491 735 421
## [46] 1653 117 126 178 759 1867 78 374 847 47 584 10 177 1426 170
## [61] 353 6 182 476 511 52 1076 33 664 891 322 909 160 1417 8
## [76] 902 74 1090 167 1179 909 575 1083 819 764 361 75 49 26 112
## [91] 894 2326 37 43 974 1571 1325 767 1047 635 1708 846 962 487 1520
## [106] 268 955 514 78 7 466 998 703 252 78 3551 1452 776 14 1074
## [121] 227 14 765 390 615 493 780 59 1446 784 231 1534 703 433 405
## [136] 192 393 101 675 28 328 1812 151 399 629 52 317 899 658 532
## [151] 202 1011 2468 1074 1229 1558 2548 496 563 208 824 1418 115 728 875
## [166] 53 853 575 1941 916 325 39 288 246 186 176 1472 1338 737 115
## [181] 337 2010 436 1074 340 293 59 1702 843 8 104 38 93 878 1870
## [196] 80 474 423 611 1214 789 76 216 10 360 214 781 1271 1360 1091
## [211] 838 469 32 704 441 106 346 1200 875 413 24 739 597 355 988
## [226] 969 92 2063 412 74 273 943 310 1603 709 342 8 242 222 151
## [241] 181 2257 950 833 554 280 618 126 1117 608 9 1024 775 607 424
## [256] 1144 139 742 719 2660 871 536 291 451 538 61 674 1003 448 2
## [271] 236 343 145 588 302 548 577 1499 354 1236 947 392 1355 186 91
## [286] 337 516 191 217 328 894 44 3326 166 566 1536 3 430 23 129
## [301] 211 140 998 547 1012 194 330 263 429 3303 788 235 581 865 278
## [316] 963 392 68 342 1059 10 280 611 629 565 1038 21 242 131 158
## [331] 2357 618 311 73 1013 499 1569 760 917 886 754 883 362 3557 679
## [346] 603 428 872 280 689 858 986 306 0 120 23 899 253 2176 703
## [361] 1264 57 9 54 136 202 1062 350 212 454 6471 962 126 9 88
## [376] 27 168 136 1579 652 0 880 208 514 35 560 347 1884 1520 215
## [391] 441 2191 164 35 581 796 519 595 1149 325 2032 9 518 258 1880
## [406] 1025 686 68 1254 349 2134 234 403 295 1918 221 450 261 518 117
## [421] 1096 1630 224 403 511 446 1374 1120 1400 1245 1213 786 27 567 379
## [436] 328 630 309 246 36 188 138 308 65 576 146 16 390 1144 609
## [451] 197 924 941 1002 176 48 1735
Count the reads of all the splice junctions of the gene harboring the variant:
GeneSJ$rowSum_SJtotal <- rowSums(GeneSJ[,grep("chr", names(GeneSJ))])
Normalization of the expression by the total read counts of all the splice junctions of the gene:
GeneSJ$Normalized_CanonEx19_20 <- (GeneSJ$chr13_28018590_28023349)/GeneSJ$rowSum_SJtotal*100
GeneSJ$Normalized_CanonEx20_21 <- (GeneSJ$chr13_28015702_28018466)/GeneSJ$rowSum_SJtotal*100
GeneSJ$Normalized_SE20 <- (GeneSJ$chr13_28015702_28023349)/GeneSJ$rowSum_SJtotal*100
GeneSJ$INCLUSION_Ex20 <- GeneSJ$chr13_28018590_28023349 + GeneSJ$chr13_28015702_28018466
GeneSJ$PSI_SE20 <- (GeneSJ$INCLUSION_Ex20)/(GeneSJ$chr13_28015702_28023349+GeneSJ$INCLUSION_Ex20)
GeneSJ$Normalized_SE19 <- (GeneSJ$chr13_28018590_28024860)/GeneSJ$rowSum_SJtotal*100
GeneSJ$INCLUSION_Ex19 <- GeneSJ$chr13_28018590_28023349 + GeneSJ$chr13_28023478_28024860
GeneSJ$PSI_SE19 <- (GeneSJ$INCLUSION_Ex19)/(GeneSJ$chr13_28018590_28024860+GeneSJ$INCLUSION_Ex19)
GeneSJ$Normalized_SE1920 <- (GeneSJ$chr13_28015702_28024860)/GeneSJ$rowSum_SJtotal*100
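The normalization and PSI steps above can be illustrated on a toy table. This is a minimal sketch, assuming a hypothetical data frame `toy` with invented read counts; only the junction column names are taken from the analysis. The pattern is anchored as `^chr` (an assumption, not the original call) so that derived columns are never included in the total.

```r
# Toy junction counts for two samples (invented numbers, real column names)
toy <- data.frame(
  sample_id = c("S1", "S2"),
  chr13_28018590_28023349 = c(90, 40),   # exon 19-20 junction
  chr13_28015702_28018466 = c(90, 40),   # exon 20-21 junction
  chr13_28015702_28023349 = c(20, 120)   # exon 20 skipping junction
)

# Total junction reads per sample; "^chr" only matches names starting with chr
sj_cols <- grep("^chr", names(toy), value = TRUE)
toy$rowSum_SJtotal <- rowSums(toy[, sj_cols])

# Percentage of the gene's junction reads that support exon 20 skipping
toy$Normalized_SE20 <- toy$chr13_28015702_28023349 / toy$rowSum_SJtotal * 100

# PSI as defined in this report: inclusion reads / (skipping + inclusion reads)
inclusion <- toy$chr13_28018590_28023349 + toy$chr13_28015702_28018466
toy$PSI_SE20 <- inclusion / (toy$chr13_28015702_28023349 + inclusion)

toy[, c("sample_id", "Normalized_SE20", "PSI_SE20")]
```

A sample with no junction reads at all would give `NaN` here, because both the numerator and the denominator are zero.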
Download the normalized values for the assessed splice junctions of all the AML samples:
Mutated samples vaf:
Canonical splice junction: Exon 20-21 chr13:28015702-28018466; donor splice site chr13:28018466
Splicing alterations:
Violin Plots for the alternative splice junctions interrogated:
SJCounts <- GeneSJ
Normality Test:
shapiro.test(SJCounts$Normalized_CanonEx20_21[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_CanonEx20_21[SJCounts$GROUP == "WT"]
## W = 0.9067, p-value = 4.086e-16
Mean normalized expression of the canonical SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_CanonEx20_21[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 5.304403
Normalized expression value of the canonical SJ in the MUT sample:
MUT_SJi <- SJCounts$Normalized_CanonEx20_21[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 4.803404
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_CanonEx20_21 - mean_WT_SJi
Difference in the MUT sample: deviation of the Normalized expression of the MUT patient from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] -0.5009997
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:455] = -5.3044, -4.4473, -4.2071, ..., 5.2219, 6.4165
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.2872807
print(paste0("MUT Percentile: ", v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])))
## [1] "MUT Percentile: 0.287280701754386"
print(paste0("Inferred Pvalue: ", v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])))
## [1] "Inferred Pvalue: 0.287280701754386"
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_CanonEx20_21")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
MUT_df$Pvalue <- MUT_df$ECDF
MUT_df$Prediction <- "Donor Loss"
MUT_df$splice_junction_status <- "CanonicalSJ"
MUT_df$splice_junction_position <- "chr13:28015702-28018466"
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
MUT_df$Pvalue <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$Pvalue)
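The percentile and inferred p-value steps are repeated for every junction, so they can be wrapped in one function. The sketch below is illustrative only: `ecdf_percentile` and its `tail` argument are hypothetical names, and the data are invented. It assumes the lower tail is used when a decrease is expected (for example donor loss of a canonical junction) and the upper tail when an increase is expected (for example exon skipping), mirroring the calls above.

```r
# Hypothetical helper: percentile of each MUT sample within the WT distribution
# of a normalized junction value, plus a one-sided empirical p-value.
ecdf_percentile <- function(values, group, tail = c("lower", "upper")) {
  tail <- match.arg(tail)
  wt  <- values[group == "WT"]
  mut <- values[group == "MUT"]
  wt_ecdf <- ecdf(wt)          # step function built from WT samples only
  pct <- wt_ecdf(mut)          # fraction of WT values <= each MUT value
  pval <- if (tail == "lower") pct else 1 - pct
  data.frame(value = mut, ECDF = pct, Pvalue = pval)
}

# Toy data: ten WT values and one MUT value (invented)
vals <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 2.5)
grp  <- c(rep("WT", 10), "MUT")

res_lower <- ecdf_percentile(vals, grp, tail = "lower")
res_upper <- ecdf_percentile(vals, grp, tail = "upper")

# Subtracting the WT mean shifts both groups equally, so the percentile
# is the same as on the raw values
res_shift <- ecdf_percentile(vals - mean(vals[grp == "WT"]), grp, tail = "lower")
```

Because the ECDF depends only on ranks, the `Difference` column and the raw normalized values give identical percentiles; the subtraction mainly helps when reading the deviation itself.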
Download the vaf, inferred percentiles and pvalues of the mutated samples:
Normality Test:
shapiro.test(SJCounts$Normalized_SE20[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_SE20[SJCounts$GROUP == "WT"]
## W = 0.77609, p-value < 2.2e-16
Value of Mean Normalized Expression of the SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_SE20[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 0.1719062
Normalized Expression Value of the SJ in the MUT sample:
MUT_SJi <- SJCounts$Normalized_SE20[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 0.2504632
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_SE20 - mean_WT_SJi
Difference in the MUT sample: deviation of the Normalized expression of the MUT patient from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] 0.07855696
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:424] = -0.17191, -0.151, -0.14868, ..., 0.8037, 1.1439
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.8267544
print(paste0("MUT Percentile: ", v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])))
## [1] "MUT Percentile: 0.826754385964912"
1-v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.1732456
print(paste0("Inferred Pvalue: ", 1-v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])))
## [1] "Inferred Pvalue: 0.173245614035088"
Download the vaf, inferred percentiles and pvalues of the mutated samples:
Normality Test:
shapiro.test(SJCounts$Normalized_SE19[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_SE19[SJCounts$GROUP == "WT"]
## W = 0.56862, p-value < 2.2e-16
Value of Mean Normalized Expression of the SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_SE19[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 0.07528691
Normalized Expression Value of the SJ in the MUT sample:
MUT_SJi <- SJCounts$Normalized_SE19[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 0.01715501
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_SE19 - mean_WT_SJi
Difference in the MUT sample: deviation of the Normalized expression of the MUT patient from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] -0.0581319
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:355] = -0.075287, -0.072386, -0.072168, ..., 0.80191, 1.2068
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.4232456
print(paste0("MUT Percentile: ", v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])))
## [1] "MUT Percentile: 0.423245614035088"
1-v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.5767544
print(paste0("Inferred Pvalue: ", 1-v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])))
## [1] "Inferred Pvalue: 0.576754385964912"
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.4232456
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_SE19")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
MUT_df$Pvalue <- 1- MUT_df$ECDF
MUT_df$Prediction <- "Exon Skipping"
MUT_df$splice_junction_status <- "AlternativeSJ found in MUT samples"
MUT_df$splice_junction_position <- "chr13:28018590-28024860"
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
MUT_df$Pvalue <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$Pvalue)
Download the vaf, inferred percentiles and pvalues of the mutated samples:
Normality Test:
shapiro.test(SJCounts$Normalized_SE1920[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_SE1920[SJCounts$GROUP == "WT"]
## W = 0.3588, p-value < 2.2e-16
Value of Mean Normalized Expression of the SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_SE1920[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 0.04698132
Normalized Expression Value of the SJ in the MUT sample:
MUT_SJi <- SJCounts$Normalized_SE1920[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 0.01372401
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_SE1920 - mean_WT_SJi
Difference in the MUT sample: deviation of the Normalized expression of the MUT patient from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] -0.03325731
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:233] = -0.046981, -0.045737, -0.044446, ..., 1.2688, 1.4076
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.6074561
print(paste0("MUT Percentile: ", v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])))
## [1] "MUT Percentile: 0.607456140350877"
1-v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.3925439
print(paste0("Inferred Pvalue: ", 1-v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])))
## [1] "Inferred Pvalue: 0.392543859649123"
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.6074561
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_SE1920")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
MUT_df$Pvalue <- 1- MUT_df$ECDF
MUT_df$Prediction <- "Exon Skipping"
MUT_df$splice_junction_status <- "AlternativeSJ found in MUT samples"
MUT_df$splice_junction_position <- "chr13:28015702-28024860"
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
MUT_df$Pvalue <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$Pvalue)
Download the vaf, inferred percentiles and pvalues of the mutated samples:
Three (3) variants: chr2,208248389,G,A, chr2,208248389,G,T & chr2,208248389,G,C
Variants found in 30 patients of the BeatAML (35 samples)
Patients with the variant and RNASeq for validation:
The splicing alterations being assessed are:
Variant information:
Load the extracted splice junctions of the gene harboring the mutation.
extractedSJ_path <- paste0(extractedSJ_dir_in,"IDH1_UM_annotSJ.tsv")
GeneSJ <- read.delim(extractedSJ_path, sep ="\t")
Set each sample's group: mutated (MUT) or non-mutated (WT)
samples_df <- found_variants[found_variants$Gene=="IDH1" & found_variants$MutationKey_Hg38 %in% c( "chr2,208248389,G,A","chr2,208248389,G,T" ,"chr2,208248389,G,C"),]
R132C <- samples_df$RNA_Sample[samples_df$MutationKey_Hg38 == "chr2,208248389,G,A" & samples_df$Validable =="Validable"] #n=15 G>A
R132G <- samples_df$RNA_Sample[samples_df$MutationKey_Hg38 == "chr2,208248389,G,C" & samples_df$Validable =="Validable"] #n=3 G>C
R132S <- samples_df$RNA_Sample[samples_df$MutationKey_Hg38 == "chr2,208248389,G,T" & samples_df$Validable =="Validable"] #n=4 G>T
cases <- c(R132C, R132G, R132S)
GeneSJ$GROUP <- ifelse(GeneSJ$sample_id %in% cases, "MUT", "WT")
GeneSJ$Variant_status <- ifelse(GeneSJ$sample_id %in% R132C, "R132C",
                         ifelse(GeneSJ$sample_id %in% R132G, "R132G",
                         ifelse(GeneSJ$sample_id %in% R132S, "R132S",
                                "WT")))
Search for the splice junctions of interest in the extracted splice junctions of the gene by position (chr_SJstart_SJend).
Search: chr2:208245425-208248422
Show all the splice junctions containing the position 208245425-208248422
colnames(GeneSJ)[grep("208245425_208248422",colnames(GeneSJ))]
## [1] "chr2_208245425_208248422"
Found: chr2_208245425_208248422
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr2_208245425_208248422
## [1] 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0
## [38] 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [75] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
## [112] 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
## [149] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [186] 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
## [223] 0 0 0 0 0 0 0 0 0 9 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
## [260] 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0
## [297] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [334] 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [371] 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [408] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 0 0 0 2 0 0 0 0 0 0 0 0 0 0 1 0
## [445] 0 0 0 0 0 1 0 0 0 0 0 0 2
Samples with the SJ of interest:
table(GeneSJ$chr2_208245425_208248422>0)
##
## FALSE TRUE
## 427 30
Groups of the samples having the alternative splice junction:
table(GeneSJ$GROUP[GeneSJ$chr2_208245425_208248422 > 0])
##
## MUT WT
## 7 23
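The two tabulations above (junction present or absent, then group among the positive samples) can be combined into a single two-way table. A minimal sketch on invented data; the data frame `toy` and its `reads` column are hypothetical stand-ins for `GeneSJ` and a junction count column.

```r
# Toy data: group label and read count for one junction (invented)
toy <- data.frame(
  GROUP = c("MUT", "MUT", "WT", "WT", "WT"),
  reads = c(3, 0, 0, 1, 0)
)

# Rows: group; columns: whether the junction has at least one read
tab <- table(group = toy$GROUP, has_junction = toy$reads > 0)
tab

# Row-wise proportions: fraction of each group carrying the junction
prop <- prop.table(tab, margin = 1)
prop
```

Reading the proportions by row shows how common the junction is within each group, which the count among positive samples alone does not reveal when the groups differ greatly in size.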
Alternative SJ found in the mutated samples.
Search: chr2:208245425-208251429
Show all the splice junctions containing the positions 208245425-208251429
colnames(GeneSJ)[grep("208245425_208251429",colnames(GeneSJ))]
## [1] "chr2_208245425_208251429"
Found: chr2_208245425_208251429
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr2_208245425_208251429
## [1] 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [38] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [75] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [112] 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [149] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [186] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [223] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [260] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [297] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [334] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [371] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [408] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [445] 0 0 0 0 1 0 0 0 0 0 0 0 0
Samples with the SJ of interest:
table(GeneSJ$chr2_208245425_208251429>0)
##
## FALSE TRUE
## 454 3
Groups of the samples having the alternative splice junction:
table(GeneSJ$GROUP[GeneSJ$chr2_208245425_208251429 > 0])
##
## WT
## 3
Alternative SJ not found in the mutated samples of the splice junction collection.
Exon 3-4: chr2:208248661-208251429
Exon 4-5: chr2:208245425-208248368; splice site chr2:208248368
Reads of all the AML samples (mutated and non-mutated) for the canonical exon 3-4 splice junction:
GeneSJ$chr2_208248661_208251429
## [1] 730 57 172 4 15 197 349 485 352 132 270 379 109 114 186
## [16] 100 233 393 11 193 37 705 110 50 37 90 304 7 8 336
## [31] 464 41 353 231 184 24 269 38 43 38 779 110 777 141 82
## [46] 85 316 32 50 399 55 63 154 257 15 222 4 69 77 229
## [61] 58 16 179 25 453 15 97 26 217 103 155 82 35 244 57
## [76] 295 86 461 20 278 416 300 20 312 75 206 125 177 120 87
## [91] 667 133 24 52 57 681 395 299 275 917 77 544 111 68 488
## [106] 245 161 164 57 33 200 665 71 426 43 373 638 155 165 172
## [121] 54 105 98 294 470 59 27 527 203 61 161 106 925 208 333
## [136] 218 61 55 161 19 114 227 8 19 255 47 119 50 1051 171
## [151] 377 121 44 150 298 188 506 93 31 312 636 89 129 90 318
## [166] 135 232 71 116 95 52 77 110 9 102 92 326 969 205 46
## [181] 509 257 60 157 114 100 151 248 178 40 58 37 56 119 379
## [196] 7 347 307 570 192 194 10 260 26 384 14 434 244 352 49
## [211] 62 32 12 816 35 18 23 745 134 37 12 27 220 151 281
## [226] 40 221 49 168 271 372 110 86 317 317 320 7 107 278 249
## [241] 114 81 147 379 146 13 58 7 328 50 8 74 196 381 176
## [256] 414 74 184 128 104 174 46 11 238 203 19 201 176 23 52
## [271] 180 35 259 251 181 228 380 175 71 94 513 210 124 51 17
## [286] 295 30 30 53 84 183 155 85 143 348 458 18 318 17 340
## [301] 88 37 118 1517 88 297 113 16 89 193 2470 124 415 50 269
## [316] 254 20 210 118 94 85 439 53 32 322 532 590 316 365 104
## [331] 141 13 45 93 168 56 127 136 121 454 57 20 783 680 319
## [346] 655 200 428 74 89 42 121 74 133 19 43 33 89 103 110
## [361] 490 13 31 29 80 282 238 147 27 359 662 168 130 4 33
## [376] 94 287 27 1409 173 0 86 37 49 22 350 13 229 180 169
## [391] 147 229 467 3 24 181 742 245 261 583 437 23 90 157 95
## [406] 136 19 52 184 97 591 25 197 38 446 32 183 262 242 43
## [421] 103 277 62 79 83 134 21 169 220 683 182 158 1 143 127
## [436] 70 191 14 137 33 25 80 701 140 255 143 41 19 596 891
## [451] 244 377 437 317 68 36 2812
Reads of all the AML samples (mutated and non-mutated) for the canonical exon 4-5 splice junction:
GeneSJ$chr2_208245425_208248368
## [1] 1035 76 129 10 23 181 382 511 638 148 317 509 143 124 199
## [16] 93 327 399 36 284 24 615 143 75 51 81 373 13 20 428
## [31] 555 87 613 408 148 50 464 36 65 39 797 156 1036 273 180
## [46] 102 117 44 39 647 116 64 248 283 25 233 12 112 90 390
## [61] 78 12 259 36 615 26 103 34 364 157 121 165 36 353 92
## [76] 429 80 557 38 440 510 420 23 442 101 201 88 280 134 98
## [91] 845 124 41 59 64 761 585 393 470 930 100 474 170 116 612
## [106] 313 297 269 41 20 234 663 77 487 70 559 746 221 218 218
## [121] 52 136 154 314 558 115 33 771 249 110 222 153 1654 236 572
## [136] 379 91 72 192 46 253 397 30 23 330 61 166 66 1200 256
## [151] 553 195 71 234 330 246 673 125 35 282 844 123 202 125 281
## [166] 185 248 124 178 132 80 106 93 10 119 139 281 901 386 87
## [181] 465 359 51 269 100 62 174 286 226 26 65 21 49 262 629
## [196] 10 598 404 543 231 333 12 287 30 630 34 374 320 529 62
## [211] 52 52 21 948 42 23 55 1075 161 80 23 30 249 151 349
## [226] 45 265 107 282 322 641 60 109 572 257 349 20 63 578 320
## [241] 124 86 152 414 124 25 64 19 499 67 14 130 366 573 277
## [256] 559 101 264 200 138 332 62 27 377 173 37 166 366 32 117
## [271] 272 55 455 304 258 268 419 264 90 89 736 295 166 62 11
## [286] 397 44 20 43 134 210 43 153 243 668 672 1 493 22 87
## [301] 129 37 158 1283 132 268 199 22 91 383 2332 140 532 82 384
## [316] 459 27 154 176 93 72 545 53 30 460 430 156 415 316 102
## [331] 212 37 36 54 213 74 106 232 171 675 110 24 240 961 444
## [346] 612 289 494 60 118 45 77 100 166 52 43 56 143 139 111
## [361] 477 13 19 41 134 447 384 209 46 478 955 199 226 23 52
## [376] 115 262 10 2021 206 1 121 46 125 24 317 22 335 288 233
## [391] 261 214 377 8 35 312 1019 180 423 465 633 22 134 301 127
## [406] 175 55 59 208 73 874 40 276 32 894 49 215 281 275 65
## [421] 59 352 42 154 116 173 53 82 253 991 180 279 7 163 123
## [436] 74 314 38 222 57 60 71 996 175 291 222 92 27 823 947
## [451] 267 345 437 469 72 27 2835
Count the reads of all the splice junctions of the gene harboring the variant:
GeneSJ$rowSum_SJtotal <- rowSums(GeneSJ[,grep("chr", names(GeneSJ))])
Normalization of the expression by the total read counts of all the splice junctions of the gene:
GeneSJ$Normalized_CanonEx3_4 <- (GeneSJ$chr2_208248661_208251429)/GeneSJ$rowSum_SJtotal*100
GeneSJ$Normalized_CanonEx4_5 <- (GeneSJ$chr2_208245425_208248368)/GeneSJ$rowSum_SJtotal*100
GeneSJ$Normalized_DGEx4 <- (GeneSJ$chr2_208245425_208248422)/GeneSJ$rowSum_SJtotal*100
Download the normalized values for the assessed splice junctions of all the AML samples:
Mutated samples vaf:
Canonical splice junction: Exon 4-5: chr2:208245425-208248368; donor splice site chr2:208248368
Splicing alterations:
Violin Plots for the alternative splice junctions interrogated:
SJCounts <- GeneSJ
Normality Test:
shapiro.test(SJCounts$Normalized_DGEx4[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_DGEx4[SJCounts$GROUP == "WT"]
## W = 0.16047, p-value < 2.2e-16
Value of Mean Normalized Expression of the Alternative SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_DGEx4[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 0.00342833
Normalized Expression Value of the Alternative SJ in the MUT sample:
MUT_SJi <- SJCounts$Normalized_DGEx4[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 0.25252525 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
## [7] 0.00000000 0.20661157 0.30303030 0.00000000 0.07942812 1.21130552
## [13] 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.52151239
## [19] 0.00000000 0.00000000 0.00000000 0.52521008
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_DGEx4 - mean_WT_SJi
Difference in the MUT samples: deviation of each mutated sample's normalized expression from the mean normalized WT expression
SJCounts[SJCounts$GROUP == "MUT", c("sample_id", "Difference")]
## sample_id Difference
## 16 BA2035R 0.24909692
## 21 BA2046R -0.00342833
## 40 BA2088R -0.00342833
## 113 BA2273R -0.00342833
## 120 BA2286R -0.00342833
## 127 BA2302R -0.00342833
## 154 BA2387R -0.00342833
## 170 BA2421R 0.20318324
## 186 BA2459R 0.29960197
## 215 BA2514R -0.00342833
## 219 BA2523R 0.07599979
## 232 BA2552R 1.20787719
## 288 BA2695R -0.00342833
## 317 BA2756R -0.00342833
## 323 BA2769R -0.00342833
## 333 BA2798R -0.00342833
## 334 BA2804R -0.00342833
## 352 BA2837R 0.51808406
## 378 BA2883R -0.00342833
## 392 BA2911R -0.00342833
## 398 BA2926R -0.00342833
## 428 BA2999R 0.52178175
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:24] = -0.0034283, 0.0084984, 0.011035, ..., 0.22805, 0.24974
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.9977011 0.9471264 0.9471264 0.9471264 0.9471264 0.9471264 0.9471264
## [8] 0.9954023 1.0000000 0.9471264 0.9908046 1.0000000 0.9471264 0.9471264
## [15] 0.9471264 0.9471264 0.9471264 1.0000000 0.9471264 0.9471264 0.9471264
## [22] 1.0000000
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_DGEx4")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
MUT_df$Pvalue <- NA
MUT_df$Prediction <- "Donor Gain"
MUT_df$splice_junction_status <- "AlternativeSJ found in MUT samples"
MUT_df$splice_junction_position <- "chr2:208245425-208248422"
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
Download the vaf, inferred percentiles and pvalues of the mutated samples:
Kruskal-Wallis test:
kruskal.test(Normalized_DGEx4 ~ Variant_status, data = SJCounts)
##
## Kruskal-Wallis rank sum test
##
## data: Normalized_DGEx4 by Variant_status
## Kruskal-Wallis chi-squared = 69.916, df = 3, p-value = 4.448e-15
Pairwise comparisons:
pairwise.wilcox.test(SJCounts$Normalized_DGEx4, SJCounts$Variant_status)
##
## Pairwise comparisons using Wilcoxon rank sum test with continuity correction
##
## data: SJCounts$Normalized_DGEx4 and SJCounts$Variant_status
##
## R132C R132G R132S
## R132G 0.932 - -
## R132S 0.005 0.131 -
## WT 0.045 0.932 8.9e-16
##
## P value adjustment method: holm
Detailed:
pkw <- pairwise_wilcox_test(SJCounts,Normalized_DGEx4 ~ Variant_status)
pkw
## # A tibble: 6 × 9
## .y. group1 group2 n1 n2 statistic p p.adj p.adj.signif
## * <chr> <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <chr>
## 1 Normalized… R132C R132G 15 3 27 4.66e- 1 9.32e- 1 ns
## 2 Normalized… R132C R132S 15 4 1 9.92e- 4 5 e- 3 **
## 3 Normalized… R132C WT 15 435 3771 1.1 e- 2 4.5 e- 2 *
## 4 Normalized… R132G R132S 3 4 0 4.4 e- 2 1.31e- 1 ns
## 5 Normalized… R132G WT 3 435 618 6.87e- 1 9.32e- 1 ns
## 6 Normalized… R132S WT 4 435 1739 1.49e-16 8.94e-16 ****
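The `p.adj` column can be reproduced from the unadjusted `p` column with base R's `p.adjust()` and the Holm method. The sketch below copies the six raw p-values as printed in the table above; the vector names are labels added here for readability. Because the printed values are rounded, small differences from the `p.adj` column (for example 0.044 versus 0.045 for R132C vs WT) are expected.

```r
# Unadjusted p-values copied from the pairwise table, in row order
p_raw <- c(
  R132C_vs_R132G = 4.66e-1,
  R132C_vs_R132S = 9.92e-4,
  R132C_vs_WT    = 1.1e-2,
  R132G_vs_R132S = 4.4e-2,
  R132G_vs_WT    = 6.87e-1,
  R132S_vs_WT    = 1.49e-16
)

# Holm: the k-th smallest p-value is multiplied by (m - k + 1), then the
# adjusted values are forced to be non-decreasing and capped at 1
p_holm <- p.adjust(p_raw, method = "holm")
signif(p_holm, 3)
```

The monotonicity step explains why R132G vs WT (raw p = 0.687) ends up with the same adjusted value as R132C vs R132G (raw p = 0.466): an adjusted p-value cannot be smaller than that of a comparison with a smaller raw p-value.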
Normality Test:
shapiro.test(SJCounts$Normalized_CanonEx4_5[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_CanonEx4_5[SJCounts$GROUP == "WT"]
## W = 0.93166, p-value = 3.253e-13
Mean normalized expression of the canonical SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_CanonEx4_5[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 16.20588
Normalized expression values of the canonical SJ in the MUT samples:
MUT_SJi <- SJCounts$Normalized_CanonEx4_5[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 11.742424 7.792208 18.224299 15.308151 14.082687 15.207373 16.283925
## [8] 13.636364 9.393939 13.592233 12.787927 8.075370 8.771930 11.392405
## [15] 12.649165 12.456747 9.872029 10.039113 4.716981 13.333333 11.029412
## [22] 8.613445
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_CanonEx4_5 - mean_WT_SJi
Difference in the MUT samples: deviation of each mutated sample's normalized expression from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] -4.46345182 -8.41366827 2.01842300 -0.89772497 -2.12318872
## [6] -0.99850279 0.07804878 -2.56951242 -6.81193667 -2.61364305
## [11] -3.41794913 -8.13050594 -7.43394624 -4.81347100 -3.55671138
## [16] -3.74912866 -6.33384681 -6.16676263 -11.48889493 -2.87254273
## [21] -5.17646430 -7.59243068
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:430] = -14.319, -11.052, -10.923, ..., 9.6562, 11.191
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.045977011 0.018390805 0.777011494 0.335632184 0.165517241 0.328735632
## [7] 0.471264368 0.131034483 0.025287356 0.128735632 0.075862069 0.020689655
## [13] 0.022988506 0.045977011 0.071264368 0.064367816 0.029885057 0.032183908
## [19] 0.002298851 0.105747126 0.045977011 0.022988506
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_CanonEx4_5")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
MUT_df$Pvalue <- NA
MUT_df$Prediction <- "Donor Loss"
MUT_df$splice_junction_status <- "CanonicalSJ"
MUT_df$splice_junction_position <- "chr2:208245425-208248368"
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
Download the vaf, inferred percentiles and pvalues of the mutated samples:
Kruskal-Wallis test:
kruskal.test(Normalized_CanonEx4_5 ~ Variant_status, data = SJCounts)
##
## Kruskal-Wallis rank sum test
##
## data: Normalized_CanonEx4_5 by Variant_status
## Kruskal-Wallis chi-squared = 35.08, df = 3, p-value = 1.172e-07
Pairwise comparisons:
pairwise.wilcox.test(SJCounts$Normalized_CanonEx4_5, SJCounts$Variant_status)
##
## Pairwise comparisons using Wilcoxon rank sum exact test
##
## data: SJCounts$Normalized_CanonEx4_5 and SJCounts$Variant_status
##
## R132C R132G R132S
## R132G 0.6857 - -
## R132S 0.4974 0.6857 -
## WT 5.7e-06 0.6857 0.0061
##
## P value adjustment method: holm
Detailed:
pkw <- pairwise_wilcox_test(SJCounts,Normalized_CanonEx4_5 ~ Variant_status)
pkw
## # A tibble: 6 × 9
## .y. group1 group2 n1 n2 statistic p p.adj p.adj.signif
## * <chr> <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <chr>
## 1 Normalized_C… R132C R132G 15 3 18 6.54e-1 6.87e-1 ns
## 2 Normalized_C… R132C R132S 15 4 46 1.24e-1 4.96e-1 ns
## 3 Normalized_C… R132C WT 15 435 834 9.44e-7 5.66e-6 ****
## 4 Normalized_C… R132G R132S 3 4 10 2.29e-1 6.87e-1 ns
## 5 Normalized_C… R132G WT 3 435 404 2.56e-1 6.87e-1 ns
## 6 Normalized_C… R132S WT 4 435 53 1 e-3 6 e-3 **
Variant found in 3 patients of the BeatAML (3 samples)
The splicing alterations being assessed are:
Variant information:
Load the extracted splice junctions of the gene harboring the mutation.
extractedSJ_pathUM <- paste0(extractedSJ_dir_in,"U2AF1_UM_annotSJ.tsv")
extractedSJ_pathMM <- paste0(extractedSJ_dir_in,"U2AF1_MM_annotSJ.tsv")
GeneSJ_UM <- read.delim(extractedSJ_pathUM, sep ="\t")
GeneSJ_MM <- read.delim(extractedSJ_pathMM, sep ="\t")
GeneSJ <- GeneSJ_UM[,grep("chr", names(GeneSJ_UM))] + GeneSJ_MM[,grep("chr", names(GeneSJ_MM))]
GeneSJ$INDEX <- GeneSJ_UM$INDEX
GeneSJ$sample_id <- GeneSJ_UM$sample_id
GeneSJ$case_id <- GeneSJ_UM$case_id
GeneSJ$file_id.BAM <- GeneSJ_UM$file_id.BAM
GeneSJ$file_name.BAM <- GeneSJ_UM$file_name.BAM
GeneSJ$file_id.STAR.SJCounts <- GeneSJ_UM$file_id.STAR.SJCounts
GeneSJ$file_name.STAR.SJCounts <- GeneSJ_UM$file_name.STAR.SJCounts
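Adding the two tables directly assumes that the unique-mapped (UM) and multi-mapped (MM) files list the same samples in the same row order and the same junctions in the same column order. The sketch below shows one way to make that assumption explicit; `combine_sj_counts` is a hypothetical helper, and the toy tables and their junction names are invented.

```r
# Hypothetical helper: add UM and MM junction counts after checking that
# samples and junction columns line up.
combine_sj_counts <- function(um, mm, id_col = "sample_id") {
  sj_cols <- grep("^chr", names(um), value = TRUE)
  # Both tables must contain exactly the same junction columns
  stopifnot(setequal(sj_cols, grep("^chr", names(mm), value = TRUE)))
  # Reorder MM rows so that they follow the UM sample order
  mm <- mm[match(um[[id_col]], mm[[id_col]]), , drop = FALSE]
  stopifnot(!anyNA(mm[[id_col]]))
  out <- um                      # keeps every metadata column of the UM table
  for (col in sj_cols) {
    out[[col]] <- um[[col]] + mm[[col]]   # columns are matched by name
  }
  out
}

# Toy tables with rows and columns deliberately in a different order
um <- data.frame(sample_id = c("A", "B"), chr21_1_2 = c(5, 0), chr21_3_4 = c(1, 2))
mm <- data.frame(sample_id = c("B", "A"), chr21_3_4 = c(10, 20), chr21_1_2 = c(7, 3))
combined <- combine_sj_counts(um, mm)
combined
```

If the two files are already aligned, the result is the same as the direct sum; if they are not, the function either realigns them or stops with an error instead of silently adding counts from different samples.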
Set each sample's group: mutated (MUT) or non-mutated (WT)
samples_df <- found_variants[found_variants$Gene=="U2AF1" & found_variants$MutationKey_Hg38 == "chr21,43094667,T,C",]
cases <- samples_df$RNA_Sample[samples_df$Validable == "Validable"]
GeneSJ$GROUP <- ifelse(GeneSJ$sample_id %in% cases , "MUT", "WT")
Search for the splice junctions of interest in the extracted splice junctions of the gene by position (chr_SJstart_SJend).
Search: chr21:43094564-43094666
Show all the splice junctions containing the position chr21:43094564-43094666
colnames(GeneSJ)[grep("43094564_43094666",colnames(GeneSJ))]
## [1] "chr21_43094564_43094666"
Found: chr21:43094564-43094666
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr21_43094564_43094666
## [1] 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0
## [26] 0 1 0 0 0 0 2 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0
## [51] 0 1 0 0 1 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [76] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 1
## [101] 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [126] 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 10
## [151] 0 0 0 0 0 2 0 0 0 0 0 0 9 0 0 0 0 0 0 0 0 0 0 0 0
## [176] 0 0 0 0 0 0 0 10 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
## [201] 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
## [226] 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0
## [251] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [276] 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
## [301] 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0
## [326] 0 0 0 0 0 0 1 0 0 0 0 0 0 1 1 0 0 0 2 0 0 0 0 0 0
## [351] 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
## [376] 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [401] 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
## [426] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [451] 0 0 0 0 0 0 0
Samples with the SJ of interest:
table(GeneSJ$chr21_43094564_43094666>0)
##
## FALSE TRUE
## 416 41
Groups of the samples having the alternative splice junction:
table(GeneSJ$GROUP[GeneSJ$chr21_43094564_43094666 > 0])
##
## MUT WT
## 3 38
Alternative SJ found in the mutated samples.
Search: chr21:43094564-43095437
colnames(GeneSJ)[grep("43094564",colnames(GeneSJ))]
## [1] "chr21_43094564_43094654" "chr21_43094564_43094666"
## [3] "chr21_43094564_43094670"
colnames(GeneSJ)[grep("43095437",colnames(GeneSJ))]
## [1] "chr21_43094723_43095437" "chr21_43094789_43095437"
## [3] "chr21_43094825_43095437" "chr21_43095002_43095437"
## [5] "chr21_43095024_43095437" "chr21_43095051_43095437"
Show all the splice junctions containing the position 43094564_43095437
colnames(GeneSJ)[grep("43094564_43095437",colnames(GeneSJ))]
## character(0)
Alternative SJ not found in the splice junction collection.
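Since the junction columns follow the `chr_SJstart_SJend` naming, a junction with both coordinates known can also be looked up by exact name, which avoids partial matches from a pattern search. A minimal sketch, assuming a hypothetical helper `find_sj` and an invented two-column table that reuses real junction names from the output above.

```r
# Hypothetical helper: build the junction column name from its coordinates
# and return it only if that exact column exists.
find_sj <- function(df, chrom, start, end) {
  # %d with integers avoids scientific notation for round coordinates
  key <- sprintf("%s_%d_%d", chrom, as.integer(start), as.integer(end))
  if (key %in% colnames(df)) key else character(0)
}

# Toy table with two junction columns (invented counts)
toy <- data.frame(
  chr21_43094564_43094666 = c(0, 1),
  chr21_43094789_43095437 = c(5, 8)
)

hit  <- find_sj(toy, "chr21", 43094564, 43094666)  # junction present
miss <- find_sj(toy, "chr21", 43094564, 43095437)  # junction absent
```

An absent junction returns `character(0)`, the same signal that `grep()` gives in the search above, so downstream checks such as `length(miss) == 0` work unchanged.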
Search: chr21:43094789-43095693
Show all the splice junctions containing the position chr21:43094789-43095693
colnames(GeneSJ)[grep("43094789_43095693",colnames(GeneSJ))]
## [1] "chr21_43094789_43095693"
Found: chr21:43094789-43095693
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr21_43094789_43095693
## [1] 0 84 16 669 134 13 13 14 30 59 7 7 154 377 8 29 25 12
## [19] 349 18 223 14 53 832 313 88 112 22 248 17 10 166 11 13 31 97
## [37] 22 166 62 313 13 33 25 2 96 226 57 349 28 8 51 296 38 0
## [55] 99 10 269 380 72 11 144 218 16 103 10 114 15 49 0 33 16 55
## [73] 276 28 54 30 446 29 357 9 25 4 557 14 128 14 168 15 124 20
## [91] 33 44 430 216 112 14 10 17 4 13 40 0 0 66 14 11 83 74
## [109] 16 59 81 10 229 11 47 16 9 10 36 36 141 282 45 42 14 229
## [127] 118 16 14 134 48 24 11 22 54 10 82 113 30 162 164 13 485 75
## [145] 4 174 8 364 14 11 8 267 114 29 23 37 64 25 18 65 8 114
## [163] 16 107 41 12 191 98 9 12 340 12 47 251 6 355 33 26 15 27
## [181] 0 27 24 45 229 35 38 136 0 58 17 148 266 28 19 114 17 7
## [199] 10 0 6 339 177 536 20 25 11 29 25 188 37 66 121 15 513 237
## [217] 131 17 16 176 208 103 30 164 8 147 44 40 3 14 14 92 338 68
## [235] 24 20 359 45 0 14 19 99 16 11 15 585 169 0 5 53 178 195
## [253] 11 14 10 34 137 17 26 2 17 32 26 11 12 302 18 21 54 35
## [271] 306 64 44 8 48 3 18 6 61 10 16 53 98 249 143 12 315 110
## [289] 24 114 8 210 216 74 26 7 84 0 69 29 17 149 167 26 19 13
## [307] 70 176 63 160 16 0 41 17 49 0 96 40 147 159 127 17 162 54
## [325] 28 10 33 43 24 75 28 82 220 40 114 35 0 12 76 3 103 11
## [343] 24 47 18 7 35 9 26 46 182 45 191 144 243 186 123 79 73 81
## [361] 15 319 114 266 189 15 6 319 168 7 24 87 30 344 324 77 28 153
## [379] 12 12 0 46 46 158 276 8 175 9 8 31 26 7 8 168 96 15
## [397] 13 21 16 11 10 92 29 34 36 4 30 153 62 22 11 81 11 117
## [415] 11 201 46 20 12 209 24 25 334 219 19 83 198 21 15 0 15 12
## [433] 2 25 19 263 37 174 0 297 177 55 25 46 84 17 35 265 7 12
## [451] 14 31 13 33 65 144 9
Samples with the SJ of interest:
table(GeneSJ$chr21_43094789_43095693>0)
##
## FALSE TRUE
## 17 440
Groups of the samples having the alternative splice junction:
table(GeneSJ$GROUP[GeneSJ$chr21_43094789_43095693 > 0])
##
## MUT WT
## 3 437
Alternative SJ found in the mutated samples.
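The junction is present in 440 of the 457 samples, so presence alone does not separate the groups. A two-way table that also counts the samples lacking the junction makes this easier to see. The sketch below uses a small toy data frame; the column name `sj_reads` and all values are hypothetical, not cohort data.

```r
# Toy data (hypothetical values, not from the cohort)
toy <- data.frame(
  sample_id = paste0("S", 1:6),
  GROUP     = c("MUT", "WT", "WT", "MUT", "WT", "WT"),
  sj_reads  = c(12, 0, 3, 0, 7, 0)
)

# Rows: group; columns: whether the junction has at least one read
sj_by_group <- table(GROUP = toy$GROUP, SJ_present = toy$sj_reads > 0)
sj_by_group

# Fraction of samples with and without the junction within each group
prop_by_group <- prop.table(sj_by_group, margin = 1)
prop_by_group
```

The same call on the real table would be `table(GeneSJ$GROUP, GeneSJ$chr21_43094789_43095693 > 0)`.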
Exon 4-5: chr21:43095537-43095693
Exon 5-6: chr21:43094789-43095437
Exon 6-7: chr21:43094564-43094654; splice site chr21:43094554
Reads in all AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr21_43095537_43095693 #SJ Intron 4 to 5
## [1] 1253 455 1421 125 404 1927 2803 2099 639 1452 1414 1220 627 1164 970
## [16] 1107 1020 1872 1023 882 522 1768 664 569 582 888 1072 263 718 947
## [31] 2420 698 1159 712 691 806 636 782 671 1239 2395 1832 1599 1156 690
## [46] 1631 927 485 697 366 792 290 380 2040 778 1613 542 1363 1472 1722
## [61] 777 226 608 1225 617 551 1298 872 950 1134 2314 440 419 1599 1182
## [76] 1423 901 2931 494 1400 2391 783 363 1523 830 1335 517 1352 579 1143
## [91] 2455 1728 1625 1237 602 1636 899 1266 690 1533 1065 1303 1050 449 1162
## [106] 904 1324 738 284 690 2418 892 900 3056 774 1590 2807 914 843 1020
## [121] 575 1287 780 1306 2776 664 738 815 909 756 809 780 769 1525 508
## [136] 890 466 238 1610 190 120 1258 882 669 1260 591 485 672 2135 1219
## [151] 1035 1591 822 1050 1698 1010 2292 1044 346 1105 908 709 649 709 2103
## [166] 832 617 564 709 673 281 1090 885 506 908 1241 2369 1407 1239 1253
## [181] 1214 1234 2230 418 484 676 1343 1301 877 718 1044 513 895 669 1952
## [196] 172 819 734 1955 650 866 554 903 487 884 790 1810 2178 929 470
## [211] 821 734 208 1674 267 368 330 1368 983 708 428 1546 1134 891 1279
## [226] 332 1873 444 893 1537 2852 1211 1169 829 1658 890 402 1443 462 1370
## [241] 894 602 1034 1414 941 308 1437 449 464 705 388 771 1147 1532 647
## [256] 913 635 853 2211 927 1050 588 627 1316 1203 421 1350 1421 599 1050
## [271] 2353 610 1068 1406 985 590 2211 779 813 1488 1230 1076 595 1079 461
## [286] 635 284 233 482 317 1272 1028 659 605 738 1312 338 1250 659 888
## [301] 1217 1738 396 559 979 1185 292 240 554 277 1398 545 879 723 1054
## [316] 768 478 829 1441 717 1308 1478 489 574 1632 1755 446 1018 1841 716
## [331] 1216 716 859 843 758 523 1483 1014 906 852 692 551 778 2371 2228
## [346] 1344 596 1800 777 1009 228 1204 1258 1290 1161 523 285 662 1186 777
## [361] 1722 259 1522 719 707 738 807 1336 968 1159 1192 2565 1007 535 882
## [376] 1146 892 1590 1208 962 19 820 628 763 256 1638 1275 897 813 467
## [391] 1216 1350 1226 243 657 771 492 1411 1246 1034 808 891 452 934 507
## [406] 2559 895 735 1736 2095 972 446 529 354 1223 454 1498 1293 1135 731
## [421] 840 881 1582 918 567 1210 95 978 1160 1458 1021 892 65 1275 1270
## [436] 744 1214 981 948 245 385 782 1432 1294 836 555 655 453 1384 1467
## [451] 836 1374 1308 1047 547 616 1039
Reads in all AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr21_43094789_43095437 #SJ Intron 5 to 6
## [1] 652 340 757 224 196 677 721 788 396 439 832 694 413 387 308
## [16] 590 888 875 678 530 169 715 423 569 548 419 629 244 481 878
## [31] 666 464 769 426 284 560 415 378 471 520 628 1060 1428 709 515
## [46] 882 427 284 290 282 475 142 236 642 350 487 336 1377 758 1076
## [61] 402 93 469 445 395 383 501 579 656 728 1127 336 361 858 801
## [76] 704 388 1290 306 773 606 432 231 892 445 433 227 970 568 440
## [91] 714 493 745 770 295 720 643 817 385 619 528 386 817 318 423
## [106] 869 829 442 178 319 987 428 401 808 573 858 702 836 322 754
## [121] 275 1259 493 574 1144 380 500 435 443 399 453 365 432 425 287
## [136] 579 288 104 948 145 63 684 390 466 792 237 272 363 994 491
## [151] 361 1077 476 662 662 541 1163 512 193 433 501 689 490 300 728
## [166] 856 287 305 474 426 260 607 232 302 373 1236 685 723 687 866
## [181] 470 916 745 244 207 429 393 767 556 370 342 246 422 475 993
## [196] 181 446 516 945 637 515 391 447 492 504 510 650 1170 567 526
## [211] 266 677 140 468 330 408 252 685 556 523 255 1003 486 871 584
## [226] 328 730 304 563 625 1838 477 657 598 581 386 372 878 312 533
## [241] 461 633 487 408 435 226 1046 374 248 329 229 468 698 827 388
## [256] 971 354 828 828 627 572 339 362 669 438 258 545 980 406 673
## [271] 1611 389 682 407 487 496 630 512 480 473 648 763 613 465 327
## [286] 558 295 113 199 248 516 572 496 427 414 1110 274 748 428 473
## [301] 1039 716 249 255 499 649 188 153 420 197 739 240 431 499 1029
## [316] 537 346 425 993 390 695 606 317 371 869 631 263 390 655 360
## [331] 1161 526 422 492 751 347 640 676 483 570 487 306 382 1242 1710
## [346] 446 534 513 306 728 211 530 1384 627 766 301 172 626 582 301
## [361] 518 134 773 406 521 513 533 682 292 1074 1049 1059 684 366 388
## [376] 632 483 840 636 687 9 577 420 459 117 601 784 593 576 322
## [391] 712 679 450 153 428 508 348 603 771 541 766 243 269 609 304
## [406] 837 592 319 524 727 817 302 302 223 745 416 399 386 332 206
## [421] 271 727 793 516 284 491 77 407 565 779 378 566 80 634 330
## [436] 773 711 605 631 273 225 625 958 628 762 361 459 456 1132 429
## [451] 372 669 561 720 214 320 372
Reads in all AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr21_43094564_43094654 #SJ Intron 6 to 7
## [1] 576 706 653 1638 470 707 754 754 378 587 763 594 594 926 343
## [16] 559 793 756 993 529 463 715 487 1431 897 535 713 295 780 723
## [31] 713 636 692 425 350 759 419 584 534 785 582 953 1152 676 701
## [46] 1065 494 707 372 275 679 523 288 802 491 511 732 1475 820 916
## [61] 702 523 380 719 364 549 479 641 558 653 912 412 680 798 899
## [76] 674 952 1151 762 689 693 392 829 802 685 480 606 873 739 381
## [91] 812 732 1207 1003 516 632 575 709 389 611 679 465 712 455 426
## [106] 770 978 525 178 349 935 400 686 802 553 782 741 727 433 664
## [121] 527 1346 549 562 1131 774 653 377 433 642 444 400 399 496 336
## [136] 518 434 318 877 368 345 624 1055 571 631 502 286 805 993 532
## [151] 347 1334 677 637 618 519 1025 515 259 587 456 784 398 546 816
## [166] 735 494 497 463 470 597 612 351 628 367 1422 781 642 697 881
## [181] 452 783 752 361 596 449 471 984 519 501 478 530 761 525 883
## [196] 432 446 459 780 561 499 718 700 991 486 509 708 1036 557 682
## [211] 444 741 369 460 925 634 337 641 571 719 599 1152 511 868 614
## [226] 501 748 378 513 572 1671 724 1036 707 733 430 994 740 293 505
## [241] 428 698 463 428 415 827 1392 341 236 496 583 648 641 707 387
## [256] 896 474 718 817 587 545 410 442 666 493 676 545 882 501 720
## [271] 2183 531 613 428 470 432 682 432 563 482 616 796 717 699 627
## [286] 497 586 346 259 324 531 586 730 441 422 836 344 682 506 408
## [301] 835 986 434 258 489 560 296 407 556 349 605 246 505 570 861
## [316] 487 442 512 1092 666 825 563 576 420 844 722 201 438 761 419
## [331] 1003 589 684 525 728 358 598 638 559 483 747 312 300 1236 1460
## [346] 491 562 516 423 642 532 599 1364 833 1179 640 494 756 687 497
## [361] 551 540 867 730 694 450 470 999 585 816 911 1155 706 756 842
## [376] 707 462 941 611 597 15 665 443 764 594 620 955 538 523 297
## [391] 769 564 474 560 511 469 328 528 697 421 648 509 418 630 401
## [406] 814 647 525 639 746 688 418 253 418 727 680 547 425 348 541
## [421] 445 619 1090 838 295 560 317 460 557 688 459 551 70 614 361
## [436] 915 724 840 498 594 536 683 761 698 791 409 572 803 915 446
## [451] 364 610 542 728 428 449 397
Sum the reads across all the splice junctions of the gene harboring the variant:
GeneSJ$rowSum_SJtotal <- rowSums(GeneSJ[,grep("chr", names(GeneSJ))])
Normalize each junction's read count by the total read count of all the splice junctions of the gene, expressed as a percentage:
GeneSJ$Normalized_CanonEx4_5 <- (GeneSJ$chr21_43095537_43095693)/GeneSJ$rowSum_SJtotal*100
GeneSJ$Normalized_CanonEx5_6 <- (GeneSJ$chr21_43094789_43095437)/GeneSJ$rowSum_SJtotal*100
GeneSJ$Normalized_CanonEx6_7 <- (GeneSJ$chr21_43094564_43094654)/GeneSJ$rowSum_SJtotal*100
GeneSJ$Normalized_Ex6DG <- (GeneSJ$chr21_43094564_43094666)/GeneSJ$rowSum_SJtotal*100
GeneSJ$Normalized_ES5 <- (GeneSJ$chr21_43094789_43095693)/GeneSJ$rowSum_SJtotal*100
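The normalization above can be sketched on a toy table as follows. The junction names, counts, and the helper `normalize_sj` are hypothetical; the sketch only assumes that junction columns start with `chr`, as in the extracted table. Anchoring the pattern with `^chr` is a precaution so that columns added later cannot be picked up by the row sum.

```r
# Toy junction table (hypothetical counts); junction columns start with "chr"
toy <- data.frame(
  sample_id    = c("S1", "S2", "S3"),
  chr1_100_200 = c(90, 40, 0),
  chr1_100_300 = c(10, 10, 0),
  stringsAsFactors = FALSE
)

# Select junction columns by prefix so that derived columns are never summed
sj_cols <- grep("^chr", names(toy), value = TRUE)
toy$rowSum_SJtotal <- rowSums(toy[, sj_cols, drop = FALSE])

# Percentage of the gene's junction reads that support one junction
normalize_sj <- function(df, sj, total = "rowSum_SJtotal") {
  df[[sj]] / df[[total]] * 100
}
toy$Normalized_alt <- normalize_sj(toy, "chr1_100_300")
toy$Normalized_alt
```

A sample with no junction reads at all (S3 here) gives 0/0, which R reports as `NaN`; this is one reason to pass `na.rm=TRUE` when averaging normalized values.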
Download the normalized values for the assessed splice junctions of all the AML samples:
VAF of the mutated samples:
Canonical SJ:
Splicing alterations:
Canonical SJ:
Splicing alteration:
Violin Plots for the alternative splice junctions interrogated:
SJCounts <- GeneSJ
Normality Test:
shapiro.test(SJCounts$Normalized_Ex6DG[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_Ex6DG[SJCounts$GROUP == "WT"]
## W = 0.3147, p-value < 2.2e-16
Mean normalized expression of the alternative SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_Ex6DG[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 0.0007982023
Normalized expression of the alternative SJ in the MUT samples:
MUT_SJi <- SJCounts$Normalized_Ex6DG[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 0.08409014 0.11040236 0.04923441
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_Ex6DG - mean_WT_SJi
Difference in the MUT samples: deviation of each MUT sample's normalized expression from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] 0.08329194 0.10960415 0.04843620
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:39] = -0.0007982, 0.0034493, 0.0041878, ..., 0.016684, 0.017392
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 1 1 1
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_Ex6DG")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
MUT_df$Pvalue <- 1- MUT_df$ECDF
MUT_df$Prediction <- "Donor Gain"
MUT_df$splice_junction_status <- "AlternativeSJ found in MUT samples"
MUT_df$splice_junction_position <- "chr21:43094564-43094666"
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
MUT_df$Pvalue <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$Pvalue)
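The percentile step above can be illustrated on toy values (all numbers below are hypothetical, not cohort data). The sketch shows two things: subtracting the WT mean is a constant shift, so the ECDF of the differences returns the same percentiles as the ECDF of the raw values; and a MUT value above every WT value gets a percentile of 1, hence an upper-tail empirical p-value of 0, which is best read as "below the resolution of 1/n WT samples".

```r
# Hypothetical normalized expression values (not cohort data)
wt_values  <- c(0, 0, 0, 0.01, 0.02, 0.02, 0.05, 0.10)
mut_values <- c(0.03, 0.20)

wt_mean <- mean(wt_values)

# ECDF built on deviations from the WT mean, as in the analysis
ecdf_diff <- ecdf(wt_values - wt_mean)
pct_diff  <- ecdf_diff(mut_values - wt_mean)

# ECDF built on the raw values gives the same percentiles,
# because subtracting a constant does not change the ordering
ecdf_raw <- ecdf(wt_values)
pct_raw  <- ecdf_raw(mut_values)

# Upper-tail empirical p-value: fraction of WT samples above the MUT value
p_upper <- 1 - pct_diff
p_upper
```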
Download the VAF, inferred percentiles, and p-values of the mutated samples:
Normality:
shapiro.test(SJCounts$Normalized_Ex6DG[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_Ex6DG[SJCounts$GROUP == "WT"]
## W = 0.3147, p-value < 2.2e-16
shapiro.test(SJCounts$Normalized_Ex6DG[SJCounts$GROUP == "MUT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_Ex6DG[SJCounts$GROUP == "MUT"]
## W = 0.99354, p-value = 0.8463
Variances:
res.ftest <- var.test(Normalized_Ex6DG ~ GROUP, SJCounts, alternative = "two.sided", conf.level=0.95)
res.ftest
##
## F test to compare two variances
##
## data: Normalized_Ex6DG by GROUP
## F = 119.25, num df = 2, denom df = 453, p-value < 2.2e-16
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 32.06533 4710.00312
## sample estimates:
## ratio of variances
## 119.2536
Welch Two Sample t-test:
t <- t.test(Normalized_Ex6DG ~ GROUP, data=SJCounts,
            alternative="two.sided")
t
##
## Welch Two Sample t-test
##
## data: Normalized_Ex6DG by GROUP
## t = 4.5409, df = 2.0002, p-value = 0.04522
## alternative hypothesis: true difference in means between group MUT and group WT is not equal to 0
## 95 percent confidence interval:
## 0.004228648 0.156659550
## sample estimates:
## mean in group MUT mean in group WT
## 0.0812423016 0.0007982023
Normality Test:
shapiro.test(SJCounts$Normalized_CanonEx6_7[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_CanonEx6_7[SJCounts$GROUP == "WT"]
## W = 0.90443, p-value = 2.792e-16
Mean normalized expression of the canonical SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_CanonEx6_7[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 5.395745
Normalized expression of the canonical SJ in the MUT samples:
MUT_SJi <- SJCounts$Normalized_CanonEx6_7[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 4.473596 4.882237 3.702427
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_CanonEx6_7 - mean_WT_SJi
Difference in the MUT samples: deviation of each MUT sample's normalized expression from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] -0.9221498 -0.5135080 -1.6933182
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:454] = -2.4939, -2.4202, -2.2311, ..., 5.4026, 10.996
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.22687225 0.37665198 0.05506608
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_CanonEx6_7")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
MUT_df$Pvalue <- NA
MUT_df$Prediction <- "Donor Loss"
MUT_df$splice_junction_status <- "CanonicalSJ"
MUT_df$splice_junction_position <- "chr21:43094564-43094654"
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
Download the VAF, inferred percentiles, and p-values of the mutated samples:
Normality:
shapiro.test(SJCounts$Normalized_CanonEx6_7[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_CanonEx6_7[SJCounts$GROUP == "WT"]
## W = 0.90443, p-value = 2.792e-16
shapiro.test(SJCounts$Normalized_CanonEx6_7[SJCounts$GROUP == "MUT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_CanonEx6_7[SJCounts$GROUP == "MUT"]
## W = 0.96949, p-value = 0.6647
Variances:
res.ftest <- var.test(Normalized_CanonEx6_7 ~ GROUP, SJCounts, alternative = "two.sided", conf.level=0.95)
res.ftest
##
## F test to compare two variances
##
## data: Normalized_CanonEx6_7 by GROUP
## F = 0.21711, num df = 2, denom df = 453, p-value = 0.3902
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 0.05837787 8.57499090
## sample estimates:
## ratio of variances
## 0.2171121
Two Sample t-test:
t <- t.test(Normalized_CanonEx6_7 ~ GROUP, data=SJCounts,
            alternative="two.sided",
            var.equal=TRUE)
t
##
## Two Sample t-test
##
## data: Normalized_CanonEx6_7 by GROUP
## t = -1.4028, df = 455, p-value = 0.1614
## alternative hypothesis: true difference in means between group MUT and group WT is not equal to 0
## 95 percent confidence interval:
## -2.5041422 0.4181582
## sample estimates:
## mean in group MUT mean in group WT
## 4.352753 5.395745
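The two tests above differ in their variance assumption: the Welch test was used where the F test rejected equal variances, and the pooled-variance test where it did not. A minimal sketch of that decision on simulated toy data follows. The 0.05 cutoff and the variable names are assumptions made for illustration; the report does not state the threshold explicitly.

```r
set.seed(1)  # reproducible toy data (hypothetical, not cohort values)
toy <- data.frame(
  GROUP = rep(c("MUT", "WT"), times = c(5, 50)),
  value = c(rnorm(5, mean = 4, sd = 1), rnorm(50, mean = 5, sd = 1))
)

# F test for equality of variances between the two groups
ftest <- var.test(value ~ GROUP, data = toy)

# Pooled-variance t-test if the F test does not reject, Welch otherwise
# (0.05 is an assumed cutoff)
equal_var <- ftest$p.value >= 0.05
ttest <- t.test(value ~ GROUP, data = toy,
                alternative = "two.sided", var.equal = equal_var)
ttest$method
```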
Normality Test:
shapiro.test(SJCounts$Normalized_ES5[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_ES5[SJCounts$GROUP == "WT"]
## W = 0.74617, p-value < 2.2e-16
Mean normalized expression of the alternative SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_ES5[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 0.7802697
Normalized expression of the alternative SJ in the MUT samples:
MUT_SJi <- SJCounts$Normalized_ES5[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 0.09249916 0.19627085 0.11816257
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_ES5 - mean_WT_SJi
Difference in the MUT samples: deviation of each MUT sample's normalized expression from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] -0.6877706 -0.5839989 -0.6621072
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:438] = -0.78027, -0.7648, -0.76276, ..., 3.8119, 5.9144
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.1806167 0.4008811 0.2621145
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_ES5")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
MUT_df$Pvalue <- NA
MUT_df$Prediction <- "Exon Skipping"
MUT_df$splice_junction_status <- "AlternativeSJ found in MUT samples"
MUT_df$splice_junction_position <- "chr21:43094789-43095693"
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
Download the VAF, inferred percentiles, and p-values of the mutated samples:
Normality:
shapiro.test(SJCounts$Normalized_ES5[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_ES5[SJCounts$GROUP == "WT"]
## W = 0.74617, p-value < 2.2e-16
shapiro.test(SJCounts$Normalized_ES5[SJCounts$GROUP == "MUT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_ES5[SJCounts$GROUP == "MUT"]
## W = 0.92154, p-value = 0.4578
Variances:
res.ftest <- var.test(Normalized_ES5 ~ GROUP, SJCounts, alternative = "two.sided", conf.level=0.95)
res.ftest
##
## F test to compare two variances
##
## data: Normalized_ES5 by GROUP
## F = 0.0029752, num df = 2, denom df = 453, p-value = 0.005941
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 0.0007999698 0.1175057212
## sample estimates:
## ratio of variances
## 0.002975154
Welch Two Sample t-test (one-sided, MUT less than WT):
t <- t.test(Normalized_ES5 ~ GROUP, data=SJCounts,
            alternative="less",
            var.equal=FALSE)
t
##
## Welch Two Sample t-test
##
## data: Normalized_ES5 by GROUP
## t = -11.51, df = 20.308, p-value = 1.176e-10
## alternative hypothesis: true difference in means between group MUT and group WT is less than 0
## 95 percent confidence interval:
## -Inf -0.5481032
## sample estimates:
## mean in group MUT mean in group WT
## 0.1356442 0.7802697
Normality Test:
shapiro.test(SJCounts$Normalized_CanonEx4_5[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_CanonEx4_5[SJCounts$GROUP == "WT"]
## W = 0.96736, p-value = 1.629e-08
Mean normalized expression of the canonical SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_CanonEx4_5[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 8.08144
Normalized expression of the canonical SJ in the MUT samples:
MUT_SJi <- SJCounts$Normalized_CanonEx4_5[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 10.250589 7.961237 10.979272
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_CanonEx4_5 - mean_WT_SJi
Difference in the MUT samples: deviation of each MUT sample's normalized expression from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] 2.1691485 -0.1202036 2.8978322
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:454] = -6.8306, -6.0433, -5.7787, ..., 3.8771, 3.9066
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.8678414 0.4030837 0.9405286
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_CanonEx4_5")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
MUT_df$Pvalue <- NA
MUT_df$Prediction <- "Canonical Loss"
MUT_df$splice_junction_status <- "CanonicalSJ"
MUT_df$splice_junction_position <- "chr21:43095537-43095693"
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
Download the VAF, inferred percentiles, and p-values of the mutated samples:
Normality:
shapiro.test(SJCounts$Normalized_CanonEx4_5[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_CanonEx4_5[SJCounts$GROUP == "WT"]
## W = 0.96736, p-value = 1.629e-08
shapiro.test(SJCounts$Normalized_CanonEx4_5[SJCounts$GROUP == "MUT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_CanonEx4_5[SJCounts$GROUP == "MUT"]
## W = 0.91816, p-value = 0.4459
Variances:
res.ftest <- var.test(Normalized_CanonEx4_5 ~ GROUP, SJCounts, alternative = "two.sided", conf.level=0.95)
res.ftest
##
## F test to compare two variances
##
## data: Normalized_CanonEx4_5 by GROUP
## F = 0.57596, num df = 2, denom df = 453, p-value = 0.8748
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 0.1548666 22.7480037
## sample estimates:
## ratio of variances
## 0.5759618
Two Sample t-test:
t <- t.test(Normalized_CanonEx4_5 ~ GROUP, data=SJCounts,
            alternative="two.sided",
            var.equal=TRUE)
t
##
## Two Sample t-test
##
## data: Normalized_CanonEx4_5 by GROUP
## t = 1.3731, df = 455, p-value = 0.1704
## alternative hypothesis: true difference in means between group MUT and group WT is not equal to 0
## 95 percent confidence interval:
## -0.7110516 4.0089030
## sample estimates:
## mean in group MUT mean in group WT
## 9.730366 8.081440
Normality Test:
shapiro.test(SJCounts$Normalized_CanonEx5_6[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_CanonEx5_6[SJCounts$GROUP == "WT"]
## W = 0.99126, p-value = 0.008802
Mean normalized expression of the canonical SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_CanonEx5_6[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 4.531591
Normalized expression of the canonical SJ in the MUT samples:
MUT_SJi <- SJCounts$Normalized_CanonEx5_6[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 4.128826 6.010795 3.667963
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_CanonEx5_6 - mean_WT_SJi
Difference in the MUT samples: deviation of each MUT sample's normalized expression from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] -0.4027648 1.4792040 -0.8636277
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:454] = -3.298, -3.2718, -3.066, ..., 4.387, 4.6644
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.3854626 0.9030837 0.2665198
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_CanonEx5_6")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
MUT_df$Pvalue <- NA
MUT_df$Prediction <- "Canonical Loss"
MUT_df$splice_junction_status <- "CanonicalSJ"
MUT_df$splice_junction_position <- "chr21:43094789-43095437"
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
Download the VAF, inferred percentiles, and p-values of the mutated samples:
Normality:
shapiro.test(SJCounts$Normalized_CanonEx5_6[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_CanonEx5_6[SJCounts$GROUP == "WT"]
## W = 0.99126, p-value = 0.008802
shapiro.test(SJCounts$Normalized_CanonEx5_6[SJCounts$GROUP == "MUT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_CanonEx5_6[SJCounts$GROUP == "MUT"]
## W = 0.89075, p-value = 0.3566
Variances:
res.ftest <- var.test(Normalized_CanonEx5_6 ~ GROUP, SJCounts, alternative = "two.sided", conf.level=0.95)
res.ftest
##
## F test to compare two variances
##
## data: Normalized_CanonEx5_6 by GROUP
## F = 0.88354, num df = 2, denom df = 453, p-value = 0.8281
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 0.2375683 34.8958559
## sample estimates:
## ratio of variances
## 0.883536
Two Sample t-test:
t <- t.test(Normalized_CanonEx5_6 ~ GROUP, data=SJCounts,
            alternative="two.sided",
            var.equal=TRUE)
t
##
## Two Sample t-test
##
## data: Normalized_CanonEx5_6 by GROUP
## t = 0.092767, df = 455, p-value = 0.9261
## alternative hypothesis: true difference in means between group MUT and group WT is not equal to 0
## 95 percent confidence interval:
## -1.431801 1.573676
## sample estimates:
## mean in group MUT mean in group WT
## 4.602528 4.531591
Variant found in 3 patients of the BeatAML cohort (3 samples).
The splicing alterations being assessed are:
Variant information:
Load the extracted splice junctions of the gene harboring the mutation.
extractedSJ_path <- paste0(extractedSJ_dir_in,"WT1_UM_annotSJ.tsv")
GeneSJ <- read.delim(extractedSJ_path, sep ="\t")
Set each sample's group: mutated (MUT) or non-mutated (WT)
samples_df <- found_variants[found_variants$Gene=="WT1" & found_variants$MutationKey_Hg38 == "chr11,32396364,G,T",]
cases <- samples_df$RNA_Sample[samples_df$Validable == "Validable"]
GeneSJ$GROUP <- ifelse(GeneSJ$sample_id %in% cases , "MUT", "WT")
Search for the splice junctions of interest in the extracted splice junctions of the gene by position (chr_SJstart_SJend).
Search: splice site predicted 3 bp from the variant position 32396364, at chr11:32396366
Show all the splice junctions with a boundary between positions 32396360 and 32396369
colnames(GeneSJ)[grep("3239636",colnames(GeneSJ))]
## character(0)
Alternative SJ not found in the splice junction collection.
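Substring matching with `grep` covers a fixed ten-position block. A window of arbitrary width can be searched by parsing the junction names into numeric coordinates instead. The sketch below is one way to do this, using the three WT1 junction names that appear in this section; the helper `find_sj_near` is hypothetical, and the parsing assumes that chromosome names contain no underscore (it would need adjusting for unplaced contigs).

```r
# Junction column names in the chr_SJstart_SJend convention
sj_names <- c("chr11_32392756_32396256",
              "chr11_32392756_32399947",
              "chr11_32396408_32399947")

# Split each name into chromosome, start and end
parts <- do.call(rbind, strsplit(sj_names, "_", fixed = TRUE))
sj_pos <- data.frame(
  name  = sj_names,
  chr   = parts[, 1],
  start = as.integer(parts[, 2]),
  end   = as.integer(parts[, 3]),
  stringsAsFactors = FALSE
)

# Hypothetical helper: junctions with either boundary inside a window
find_sj_near <- function(sj_pos, chr, from, to) {
  hit <- sj_pos$chr == chr &
    ((sj_pos$start >= from & sj_pos$start <= to) |
     (sj_pos$end   >= from & sj_pos$end   <= to))
  sj_pos$name[hit]
}

find_sj_near(sj_pos, "chr11", 32396360, 32396369)  # window searched above
find_sj_near(sj_pos, "chr11", 32396400, 32396410)  # illustrative window
```

On the real table, `sj_names` would be `grep("^chr", colnames(GeneSJ), value = TRUE)`.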
Search: chr11:32392756-32399947
Show all the splice junctions containing the position 32392756-32399947
colnames(GeneSJ)[grep("32392756_32399947",colnames(GeneSJ))]
## [1] "chr11_32392756_32399947"
Found: chr11:32392756-32399947
Reads in all AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr11_32392756_32399947
## [1] 0 5 1 0 0 2 0 2 0 0 1 0 2 0 1 1 4 1 1 0 1 1 1 0 0 0 0 0 0 1 0 1 0 0 0 0 1
## [38] 0 0 0 0 2 0 0 2 0 0 0 0 0 0 0 1 0 0 0 0 1 4 0 1 0 0 0 0 2 0 0 0 0 0 8 0 1
## [75] 0 0 2 0 1 0 2 0 1 8 0 1 0 1 0 0 1 1 0 3 2 0 0 0 0 0 2 0 0 0 0 0 2 0 0 0 0
## [112] 0 1 0 0 1 0 1 0 0 0 0 0 0 2 0 0 0 0 2 0 1 0 1 1 0 0 0 0 0 0 0 1 0 0 0 0 0
## [149] 1 0 0 0 2 2 0 1 4 1 0 0 0 1 0 1 3 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 2 2 0 0 0
## [186] 0 0 0 1 0 0 0 1 0 1 0 1 0 2 3 1 0 0 0 0 0 1 3 0 0 3 0 0 0 1 0 0 0 3 0 0 5
## [223] 0 0 3 0 0 0 0 0 0 3 0 0 1 0 0 0 0 4 0 0 1 1 0 0 4 0 0 1 5 0 0 0 0 1 0 0 0
## [260] 1 0 0 0 1 0 7 0 2 0 0 0 0 0 0 0 1 1 0 0 0 2 0 1 2 5 0 4 0 0 0 0 0 0 0 0 0
## [297] 0 1 1 0 0 1 0 0 0 0 0 0 1 0 0 2 1 3 0 0 0 0 0 0 0 0 0 1 1 0 0 1 0 0 5 1 0
## [334] 0 6 0 0 1 1 0 0 0 0 2 0 0 2 1 1 0 2 0 0 1 0 0 0 0 1 2 0 0 0 0 0 0 1 1 1 0
## [371] 0 0 0 0 3 0 0 0 0 0 0 3 0 2 0 0 0 0 0 0 1 0 0 0 0 0 0 2 0 0 0 0 1 0 1 0 0
## [408] 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 8 0 1 2 0 0 2 1 1 0 0 0 0 2 1 1 0 0 4 0 1 0
## [445] 0 0 0 0 2 0 0 1 1 0 2 0 0
Samples with the SJ of interest:
table(GeneSJ$chr11_32392756_32399947>0)
##
## FALSE TRUE
## 313 144
Groups of the samples having the alternative splice junction:
table(GeneSJ$GROUP[GeneSJ$chr11_32392756_32399947 > 0])
##
## MUT WT
## 1 143
Alternative SJ found in the mutated samples.
Exon 6-7: chr11:32396408-32399947; acceptor splice site: chr11:32396408
Exon 7-8: chr11:32392756-32396256
Reads in all AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr11_32396408_32399947
## [1] 66 47 7 3 2 20 106 161 0 23 9 0 101 10 84 47 101 120
## [19] 57 30 55 50 30 43 0 69 16 2 0 79 1 64 0 36 0 25
## [37] 20 6 0 3 0 151 0 110 16 1 5 9 14 1 64 0 6 63
## [55] 4 6 1 54 63 58 42 3 0 9 0 120 13 3 12 0 6 14
## [73] 82 134 45 9 32 3 51 19 162 11 10 172 0 29 60 68 0 3
## [91] 51 69 4 262 26 0 67 102 6 105 98 51 1 22 120 16 68 20
## [109] 18 0 5 28 63 30 0 80 183 103 1 14 2 3 39 6 35 6
## [127] 6 0 0 80 3 30 2 50 7 15 44 1 49 2 0 0 62 3
## [145] 10 1 3 42 78 47 0 0 96 100 2 57 169 33 10 16 12 58
## [163] 2 68 68 0 0 40 43 2 0 6 28 29 22 2 0 54 7 1
## [181] 50 138 44 0 0 2 0 0 107 6 1 6 10 39 122 5 103 49
## [199] 75 94 161 1 2 51 0 23 213 159 21 17 51 0 13 5 4 0
## [217] 6 103 72 86 0 66 3 31 95 8 0 0 34 1 3 144 6 0
## [235] 68 23 0 104 10 32 7 2 53 55 52 42 89 2 4 97 60 22
## [253] 51 7 41 8 0 0 77 22 37 13 0 103 0 124 203 153 30 55
## [271] 13 0 3 0 11 17 20 21 4 30 114 15 24 46 22 0 22 1
## [289] 0 12 81 103 0 15 0 55 19 8 9 10 54 15 10 0 30 8
## [307] 0 27 20 0 5 34 7 27 16 3 59 4 10 83 0 7 4 29
## [325] 88 163 2 12 29 0 107 73 76 1 110 31 33 113 35 0 0 0
## [343] 30 120 25 61 28 73 54 55 49 29 41 128 5 0 18 7 73 53
## [361] 7 3 0 18 0 9 55 17 51 35 5 2 0 1 72 0 3 31
## [379] 1 101 5 49 0 37 0 124 7 9 25 9 197 42 22 0 4 107
## [397] 0 10 0 0 77 0 6 10 47 43 10 17 34 23 3 3 2 5
## [415] 14 3 7 12 0 28 25 27 324 39 20 51 0 7 111 57 216 24
## [433] 0 12 0 56 27 41 87 0 122 82 52 17 12 7 0 5 156 56
## [451] 30 66 78 12 80 1 0
Reads in all AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr11_32392756_32396256
## [1] 61 55 2 2 0 34 83 151 0 16 11 0 103 7 59 47 58 92
## [19] 49 26 68 63 27 49 2 62 16 3 1 68 4 61 0 41 0 47
## [37] 27 4 0 4 0 99 0 96 13 0 1 7 16 0 59 0 5 30
## [55] 8 9 0 47 47 39 29 4 1 3 0 100 3 2 21 0 9 38
## [73] 102 82 41 9 31 8 48 17 117 11 12 113 0 21 58 54 0 4
## [91] 33 71 4 222 41 0 68 105 15 98 73 50 0 47 86 22 109 14
## [109] 12 0 2 17 75 28 1 83 143 99 1 18 5 4 28 7 20 1
## [127] 6 0 0 71 1 32 3 45 2 8 32 3 47 3 0 6 74 2
## [145] 13 1 2 41 79 30 0 0 100 130 4 62 168 50 3 11 12 73
## [163] 1 85 66 0 0 60 30 4 0 5 41 28 9 2 4 60 4 3
## [181] 68 126 25 0 0 1 0 0 89 10 4 4 14 33 99 5 71 49
## [199] 39 66 119 0 7 38 0 15 127 122 11 13 43 0 11 1 1 2
## [217] 7 79 65 76 0 69 2 33 99 9 0 0 25 2 0 168 2 0
## [235] 52 18 0 129 7 44 9 3 69 52 55 45 98 0 1 81 77 25
## [253] 42 3 47 4 0 2 75 19 34 17 1 93 0 113 138 124 33 66
## [271] 11 4 2 0 10 12 17 21 0 15 72 23 23 72 25 1 34 1
## [289] 0 18 73 136 0 9 0 46 26 33 22 17 57 25 7 0 24 14
## [307] 0 27 21 0 4 31 5 27 27 2 57 0 7 100 2 3 8 27
## [325] 53 93 0 16 26 0 97 81 82 3 99 27 30 95 37 0 4 0
## [343] 26 117 24 55 21 59 44 50 73 31 30 149 5 0 17 7 75 40
## [361] 18 5 0 11 0 6 61 23 46 31 7 2 0 1 71 0 8 26
## [379] 0 84 2 42 3 41 0 105 4 4 33 6 174 31 21 1 1 99
## [397] 0 11 0 0 58 0 21 9 43 44 10 18 27 22 2 11 4 13
## [415] 11 5 9 14 0 29 23 12 374 35 28 79 0 0 94 35 120 32
## [433] 0 13 1 41 22 53 76 0 141 90 44 7 85 2 0 3 134 47
## [451] 25 72 87 30 93 2 0
Sum the reads across all the splice junctions of the gene harboring the variant:
GeneSJ$rowSum_SJtotal <- rowSums(GeneSJ[,grep("chr", names(GeneSJ))])
Normalize each junction's read count by the total read count of all the splice junctions of the gene, expressed as a percentage:
GeneSJ$Normalized_CanonEx6_7 <- (GeneSJ$chr11_32396408_32399947)/GeneSJ$rowSum_SJtotal*100
GeneSJ$Normalized_CanonEx7_8 <- (GeneSJ$chr11_32392756_32396256)/GeneSJ$rowSum_SJtotal*100
GeneSJ$Normalized_ES7 <- (GeneSJ$chr11_32392756_32399947)/GeneSJ$rowSum_SJtotal*100
Download the normalized values for the assessed splice junctions of all the AML samples:
VAF of the mutated samples:
Canonical SJ:
Splicing alterations:
Canonical splice junction: Exon 6-7: chr11:32396408-32399947; acceptor splice site: chr11:32396408
Splicing alteration:
Violin Plots for the alternative splice junctions interrogated:
SJCounts <- GeneSJ
Normality Test:
shapiro.test(SJCounts$Normalized_CanonEx6_7[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_CanonEx6_7[SJCounts$GROUP == "WT"]
## W = 0.67926, p-value < 2.2e-16
Mean normalized expression of the canonical SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_CanonEx6_7[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 10.21563
Normalized expression of the canonical SJ in the MUT sample:
MUT_SJi <- SJCounts$Normalized_CanonEx6_7[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 8.379888
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_CanonEx6_7 - mean_WT_SJi
Difference in the MUT sample: deviation of the MUT sample's normalized expression from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] -1.835745
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:322] = -10.216, -8.1954, -8.0847, ..., 29.784, 89.784
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.2873832
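# Collect the MUT sample(s) and their normalized expression for this junction.
MUT_df <- SJCounts[SJCounts$GROUP == "MUT", c("sample_id", "case_id", "Normalized_CanonEx6_7")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
# ECDF: fraction of WT samples whose deviation is <= that of the MUT sample.
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
# Loss of a canonical junction is read in the lower tail, so the p-value is the ECDF itself.
MUT_df$Pvalue <- MUT_df$ECDF
MUT_df$Prediction <- "Aceptor Loss"
MUT_df$splice_junction_status <- "CanonicalSJ"
MUT_df$splice_junction_position <- "chr11:32396408-32399947"
# Samples with zero normalized expression are reported as NA rather than scored.
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA, MUT_df$ECDF)
MUT_df$Pvalue <- ifelse(MUT_df$NormalizedExpression == 0, NA, MUT_df$Pvalue)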
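The percentile computed above is the fraction of WT samples whose value is at or below that of the mutated sample. Because the same WT mean is subtracted from every sample, the centering step is a pure shift and should leave that fraction unchanged. The sketch below illustrates both points on made-up numbers; `empirical_percentile` is a hypothetical helper, not a function from the report.

```r
# Hypothetical helper: fraction of WT values at or below each MUT value.
empirical_percentile <- function(wt_values, mut_values) {
  wt_cdf <- ecdf(wt_values)  # ecdf() sorts its input, which drops NA values
  wt_cdf(mut_values)
}

# Toy values, not taken from the cohort.
wt <- c(0, 2, 4, 6, 8, 10, 12, 14, 16, 18)
mut <- 5

# Percentile on the raw scale: 3 of the 10 WT values are <= 5.
raw_pct <- empirical_percentile(wt, mut)

# Percentile after subtracting the WT mean from both sides.
centered_pct <- empirical_percentile(wt - mean(wt), mut - mean(wt))

print(c(raw = raw_pct, centered = centered_pct))
```

In the report, this lower-tail fraction is used directly as the p-value for the loss of a canonical junction, and `1 - ECDF` is used for a gained or exon-skipping junction.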
Download the VAF, inferred percentiles and p-values of the mutated samples:
Normality Test:
shapiro.test(SJCounts$Normalized_ES7[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_ES7[SJCounts$GROUP == "WT"]
## W = 0.43233, p-value < 2.2e-16
Value of Mean Normalized Expression of the Alternative SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_ES7[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 0.1515082
Normalized Expression Value of the Alternative SJ in the MUT sample:
MUT_SJi <- SJCounts$Normalized_ES7[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 0.2793296
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_ES7 - mean_WT_SJi
Difference in the MUT sample: deviation of the Normalized expression of the MUT patient from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] 0.1278214
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:139] = -0.15151, -0.09786, -0.088257, ..., 3.1818, 4.0152
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.8481308
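# Collect the MUT sample(s) and their normalized expression for this junction.
MUT_df <- SJCounts[SJCounts$GROUP == "MUT", c("sample_id", "case_id", "Normalized_ES7")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
# ECDF: fraction of WT samples whose deviation is <= that of the MUT sample.
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
# An exon-skipping junction is read in the upper tail, so the p-value is 1 - ECDF.
MUT_df$Pvalue <- 1 - MUT_df$ECDF
MUT_df$Prediction <- "Exon Skipping"
MUT_df$splice_junction_status <- "AlternativeSJ found in MUT samples"
MUT_df$splice_junction_position <- "chr11:32392756-32399947"
# Samples with zero normalized expression are reported as NA rather than scored.
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA, MUT_df$ECDF)
MUT_df$Pvalue <- ifelse(MUT_df$NormalizedExpression == 0, NA, MUT_df$Pvalue)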
Download the VAF, inferred percentiles and p-values of the mutated samples:
Variant found in 1 patient of the BeatAML cohort (1 sample)
The splicing alterations being assessed are:
Variant information:
Load the extracted splice junctions of the gene harboring the mutation.
extractedSJ_path <- paste0(extractedSJ_dir_in,"WT1_UM_annotSJ.tsv")
GeneSJ <- read.delim(extractedSJ_path, sep ="\t")
Set each sample's group: mutated (MUT) or non-mutated (WT)
samples_df <- found_variants[found_variants$Gene=="WT1" & found_variants$MutationKey_Hg38 == "chr11,32392696,G,A",]
cases <- samples_df$RNA_Sample[samples_df$Validable == "Validable"]
GeneSJ$GROUP <- ifelse(GeneSJ$sample_id %in% cases , "MUT", "WT")
Search for the splice junctions of interest in the extracted splice junctions of the gene by position (chr_SJstart_SJend).
Search: chr11:32392065-32396256
Show all the splice junctions containing the position chr11:32392065-32396256
colnames(GeneSJ)[grep("32392065_32396256",colnames(GeneSJ))]
## [1] "chr11_32392065_32396256"
Found: chr11_32392065_32396256
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr11_32392065_32396256
## [1] 0 1 0 0 1 0 1 1 0 0 0 0 3 1 0 0 1 1 1 1 1 2 1 1 0
## [26] 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 2 0 4 1 0 0 0 0 0
## [51] 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 2 0 0 0 0 0 0 1 1 0
## [76] 1 0 0 0 0 1 0 0 0 0 1 0 2 0 0 1 3 0 4 0 0 0 0 0 0
## [101] 1 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 1 0 0 0 0 1 0 0
## [126] 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
## [151] 0 0 0 0 0 2 2 1 0 0 0 0 0 2 1 0 0 0 0 0 0 0 0 0 0
## [176] 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 15 0
## [201] 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 1 0 2 0
## [226] 0 0 0 2 0 0 0 0 0 2 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1
## [251] 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 2 4 0 0 0 0 0 0 0
## [276] 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0
## [301] 0 1 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
## [326] 1 0 0 0 0 0 0 1 0 2 0 1 0 2 0 0 0 0 1 0 0 1 0 0 0
## [351] 0 0 1 1 0 0 2 0 2 1 0 1 0 0 0 2 2 0 1 1 0 0 0 0 1
## [376] 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 2 0 0 0 0 1 0 0 0 0
## [401] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 4 0 0
## [426] 1 0 0 0 1 3 0 0 2 0 1 0 0 1 0 2 0 1 0 0 0 0 0 0 0
## [451] 0 0 1 0 4 0 0
Samples with the SJ of interest:
table(GeneSJ$chr11_32392065_32396256>0)
##
## FALSE TRUE
## 360 97
Groups of the samples having the alternative splice junction:
table(GeneSJ$GROUP[GeneSJ$chr11_32392065_32396256 > 0])
##
## MUT WT
## 1 96
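The tally above compares groups of very different sizes (one mutated sample against the rest of the cohort), so the share of each group that carries the junction can be easier to read than the raw counts. A minimal sketch on toy data follows; `presence_by_group` is a hypothetical helper.

```r
# Hypothetical helper: share of samples in each group with at least one
# read supporting the junction.
presence_by_group <- function(counts, group) {
  # Fixing the levels keeps both columns even if one outcome never occurs.
  present <- factor(counts > 0, levels = c(FALSE, TRUE))
  prop.table(table(group = group, present = present), margin = 1)
}

# Toy values, not taken from the cohort.
toy_counts <- c(0, 1, 0, 0, 3)
toy_group <- c("WT", "WT", "WT", "WT", "MUT")

shares <- presence_by_group(toy_counts, toy_group)
print(shares)
```

Each row of the result sums to 1, so the `TRUE` column reads as the fraction of that group in which the junction was detected.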
Alternative SJ found in the mutated samples.
Exon upstream (UE): chr11:32392756-32396256
Exon downstream (DE): chr11:32392065-32392665; donor splice site chr11:32392665
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr11_32392756_32396256
## [1] 61 55 2 2 0 34 83 151 0 16 11 0 103 7 59 47 58 92
## [19] 49 26 68 63 27 49 2 62 16 3 1 68 4 61 0 41 0 47
## [37] 27 4 0 4 0 99 0 96 13 0 1 7 16 0 59 0 5 30
## [55] 8 9 0 47 47 39 29 4 1 3 0 100 3 2 21 0 9 38
## [73] 102 82 41 9 31 8 48 17 117 11 12 113 0 21 58 54 0 4
## [91] 33 71 4 222 41 0 68 105 15 98 73 50 0 47 86 22 109 14
## [109] 12 0 2 17 75 28 1 83 143 99 1 18 5 4 28 7 20 1
## [127] 6 0 0 71 1 32 3 45 2 8 32 3 47 3 0 6 74 2
## [145] 13 1 2 41 79 30 0 0 100 130 4 62 168 50 3 11 12 73
## [163] 1 85 66 0 0 60 30 4 0 5 41 28 9 2 4 60 4 3
## [181] 68 126 25 0 0 1 0 0 89 10 4 4 14 33 99 5 71 49
## [199] 39 66 119 0 7 38 0 15 127 122 11 13 43 0 11 1 1 2
## [217] 7 79 65 76 0 69 2 33 99 9 0 0 25 2 0 168 2 0
## [235] 52 18 0 129 7 44 9 3 69 52 55 45 98 0 1 81 77 25
## [253] 42 3 47 4 0 2 75 19 34 17 1 93 0 113 138 124 33 66
## [271] 11 4 2 0 10 12 17 21 0 15 72 23 23 72 25 1 34 1
## [289] 0 18 73 136 0 9 0 46 26 33 22 17 57 25 7 0 24 14
## [307] 0 27 21 0 4 31 5 27 27 2 57 0 7 100 2 3 8 27
## [325] 53 93 0 16 26 0 97 81 82 3 99 27 30 95 37 0 4 0
## [343] 26 117 24 55 21 59 44 50 73 31 30 149 5 0 17 7 75 40
## [361] 18 5 0 11 0 6 61 23 46 31 7 2 0 1 71 0 8 26
## [379] 0 84 2 42 3 41 0 105 4 4 33 6 174 31 21 1 1 99
## [397] 0 11 0 0 58 0 21 9 43 44 10 18 27 22 2 11 4 13
## [415] 11 5 9 14 0 29 23 12 374 35 28 79 0 0 94 35 120 32
## [433] 0 13 1 41 22 53 76 0 141 90 44 7 85 2 0 3 134 47
## [451] 25 72 87 30 93 2 0
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr11_32392065_32392665
## [1] 74 69 7 4 0 60 123 187 0 19 9 0 133 14 78 47 77 109
## [19] 53 50 131 80 52 47 0 75 20 0 3 86 8 71 0 37 0 47
## [37] 30 8 0 7 2 143 0 88 19 0 1 10 17 0 85 1 4 43
## [55] 5 9 1 55 95 51 65 8 1 8 0 120 13 2 22 0 2 77
## [73] 104 134 45 14 44 8 56 35 211 14 31 182 1 49 107 66 0 7
## [91] 52 108 4 344 80 0 63 109 10 132 132 63 0 64 129 29 134 23
## [109] 27 0 9 35 89 38 0 92 218 111 0 18 1 3 52 17 31 7
## [127] 11 0 1 116 4 46 3 85 3 18 61 2 22 2 0 7 110 2
## [145] 17 1 1 57 120 57 3 0 152 153 6 76 227 39 5 19 10 79
## [163] 2 136 100 0 0 90 34 4 0 7 60 52 13 2 3 81 8 6
## [181] 81 159 46 0 0 1 0 3 98 8 4 5 19 59 129 5 95 47
## [199] 74 88 136 2 8 52 0 21 206 150 22 19 61 0 10 1 3 0
## [217] 9 113 81 115 0 101 14 45 50 6 0 0 34 1 0 239 4 0
## [235] 104 27 0 168 13 41 12 5 62 85 58 57 149 1 2 122 100 26
## [253] 53 9 38 8 0 2 113 23 47 11 0 134 3 173 182 158 38 70
## [271] 17 2 1 0 12 14 27 19 3 26 101 28 32 116 31 0 60 5
## [289] 0 25 87 149 1 8 0 56 22 41 22 32 65 27 10 0 29 20
## [307] 0 33 25 0 4 42 3 41 14 2 74 2 11 174 5 4 13 39
## [325] 95 163 0 18 28 0 107 114 119 2 126 49 34 107 53 0 8 0
## [343] 32 141 28 105 19 70 80 46 68 41 56 201 4 1 28 9 109 61
## [361] 40 6 2 14 1 7 81 30 59 41 8 5 0 2 126 1 9 38
## [379] 0 75 0 54 4 62 0 130 3 2 32 11 233 44 28 1 2 90
## [397] 0 15 0 0 70 0 28 11 54 60 19 28 51 26 1 7 4 9
## [415] 18 4 21 21 2 52 44 24 484 51 36 134 0 9 143 68 216 34
## [433] 0 20 3 40 33 44 88 0 170 98 54 20 128 8 0 1 165 77
## [451] 27 66 122 33 146 1 0
Count the reads of all the splice junctions of the gene harboring the variant:
GeneSJ$rowSum_SJtotal <- rowSums(GeneSJ[,grep("chr", names(GeneSJ))])
Normalization of the expression by the total read counts of all the splice junctions of the gene:
GeneSJ$Normalized_CanonDE <- (GeneSJ$chr11_32392065_32392665)/GeneSJ$rowSum_SJtotal*100
GeneSJ$Normalized_ES <- (GeneSJ$chr11_32392065_32396256)/GeneSJ$rowSum_SJtotal*100
Download the normalized values for the assessed splice junctions of all the AML samples:
VAF of the mutated samples:
Canonical splice junction Exon downstream (DE): chr11:32392065-32392665; donor splice site chr11:32392665
Splicing alterations:
Canonical splice junction Exon downstream (DE): chr11:32392065-32392665, donor splice site: chr11:32392665
Splicing alteration:
Violin Plots for the alternative splice junctions interrogated:
SJCounts <- GeneSJ
Normality Test:
shapiro.test(SJCounts$Normalized_CanonDE[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_CanonDE[SJCounts$GROUP == "WT"]
## W = 0.57436, p-value < 2.2e-16
Value of Mean Normalized Expression of the Canonical SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_CanonDE[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 13.82591
Normalized Expression Value of the Canonical SJ in the MUT sample:
MUT_SJi <- SJCounts$Normalized_CanonDE[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 14.72471
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_CanonDE - mean_WT_SJi
Difference in the MUT sample: deviation of the Normalized expression of the MUT patient from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] 0.898805
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:328] = -13.826, -11.326, -10.493, ..., 52.841, 86.174
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.6985981
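# Collect the MUT sample(s) and their normalized expression for this junction.
MUT_df <- SJCounts[SJCounts$GROUP == "MUT", c("sample_id", "case_id", "Normalized_CanonDE")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
# ECDF: fraction of WT samples whose deviation is <= that of the MUT sample.
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
# Loss of a canonical junction is read in the lower tail, so the p-value is the ECDF itself.
MUT_df$Pvalue <- MUT_df$ECDF
MUT_df$Prediction <- "Donor Loss"
MUT_df$splice_junction_status <- "CanonicalSJ"
MUT_df$splice_junction_position <- "chr11:32392065-32392665"
# Samples with zero normalized expression are reported as NA rather than scored.
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA, MUT_df$ECDF)
MUT_df$Pvalue <- ifelse(MUT_df$NormalizedExpression == 0, NA, MUT_df$Pvalue)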
Download the VAF, inferred percentiles and p-values of the mutated samples:
Normality Test:
shapiro.test(SJCounts$Normalized_ES[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_ES[SJCounts$GROUP == "WT"]
## W = 0.11859, p-value < 2.2e-16
Value of Mean Normalized Expression of the Alternative SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_ES[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 0.1147521
Normalized Expression Value of the Alternative SJ in the MUT sample:
MUT_SJi <- SJCounts$Normalized_ES[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 0.128041
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_ES - mean_WT_SJi
Difference in the MUT sample: deviation of the Normalized expression of the MUT patient from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] 0.01328883
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:95] = -0.11475, -0.048527, -0.039621, ..., 3.0102, 14.171
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.8317757
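# Collect the MUT sample(s) and their normalized expression for this junction.
MUT_df <- SJCounts[SJCounts$GROUP == "MUT", c("sample_id", "case_id", "Normalized_ES")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
# ECDF: fraction of WT samples whose deviation is <= that of the MUT sample.
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
# An exon-skipping junction is read in the upper tail, so the p-value is 1 - ECDF.
MUT_df$Pvalue <- 1 - MUT_df$ECDF
MUT_df$Prediction <- "Exon Skipping"
MUT_df$splice_junction_status <- "AlternativeSJ found in MUT samples"
MUT_df$splice_junction_position <- "chr11:32392065-32396256"
# Samples with zero normalized expression are reported as NA rather than scored.
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA, MUT_df$ECDF)
MUT_df$Pvalue <- ifelse(MUT_df$NormalizedExpression == 0, NA, MUT_df$Pvalue)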
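The WT mean, deviation, ECDF and result table are computed with the same steps for every junction in this report. As a sketch only, those steps could be gathered into one function so that the tail being read (lower for a loss, upper for a gain or exon skipping) is the only thing that changes between calls. `score_junction`, its arguments and the toy data are hypothetical and not part of the original code.

```r
# Hypothetical wrapper around the per-junction steps used in the report.
score_junction <- function(sj_counts, value_col, side = c("lower", "upper")) {
  side <- match.arg(side)
  is_mut <- sj_counts$GROUP == "MUT"

  # Deviation of every sample from the mean normalized expression in WT.
  wt_mean <- mean(sj_counts[[value_col]][!is_mut], na.rm = TRUE)
  difference <- sj_counts[[value_col]] - wt_mean
  wt_cdf <- ecdf(difference[!is_mut])

  out <- sj_counts[is_mut, c("sample_id", "case_id", value_col)]
  colnames(out) <- c("sample_id", "case_id", "NormalizedExpression")
  out$ECDF <- wt_cdf(difference[is_mut])
  out$Pvalue <- if (side == "lower") out$ECDF else 1 - out$ECDF

  # Same masking as in the report: zero expression is reported as NA.
  zero <- !is.na(out$NormalizedExpression) & out$NormalizedExpression == 0
  out$ECDF[zero] <- NA
  out$Pvalue[zero] <- NA
  out
}

# Toy data, not taken from the cohort.
toy <- data.frame(
  sample_id = paste0("s", 1:5),
  case_id = paste0("c", 1:5),
  GROUP = c("WT", "WT", "WT", "WT", "MUT"),
  Normalized_ES = c(0, 1, 2, 3, 2.5)
)
res <- score_junction(toy, "Normalized_ES", side = "upper")
print(res)
```

The prediction label, junction status and position are left out of the sketch; they are constants per call and could be added as plain columns afterwards.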
Download the VAF, inferred percentiles and p-values of the mutated samples:
Variant found in 1 patient of the BeatAML cohort (1 sample)
The splicing alterations being assessed are:
Variant information:
Load the extracted splice junctions of the gene harboring the mutation.
extractedSJ_path <- paste0(extractedSJ_dir_in,"TP53_UM_annotSJ.tsv")
GeneSJ <- read.delim(extractedSJ_path, sep ="\t")
Set each sample's group: mutated (MUT) or non-mutated (WT)
samples_df <- found_variants[found_variants$Gene=="TP53" & found_variants$MutationKey_Hg38 == "chr17,7675238,T,C",]
cases <- samples_df$RNA_Sample[samples_df$Validable == "Validable"]
GeneSJ$GROUP <- ifelse(GeneSJ$sample_id %in% cases , "MUT", "WT")
Search for the splice junctions of interest in the extracted splice junctions of the gene by position (chr_SJstart_SJend).
Search: chr17:7675216-7675993
Show all the splice junctions containing the position chr17:7675216-7675993
colnames(GeneSJ)[grep("7675216_7675993",colnames(GeneSJ))]
## [1] "chr17_7675216_7675993"
Found: chr17_7675216_7675993
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr17_7675216_7675993
## [1] 1 0 0 0 0 2 3 1 0 0 3 1 2 0 1 2 0 4
## [19] 0 0 0 1 0 0 1 0 2 0 1 3 0 2 3 0 0 0
## [37] 1 1 1 0 2 0 2 4 3 1 0 0 1 1 0 0 3 3
## [55] 0 1 0 0 4 0 0 0 0 0 1 0 1 0 0 1 226 0
## [73] 1 2 0 1 1 7 0 2 4 0 0 3 0 3 0 3 1 1
## [91] 5 3 1 0 0 1 3 2 2 1 1 1 2 0 2 4 0 0
## [109] 0 0 0 0 1 4 0 3 4 0 2 3 1 1 2 2 0 0
## [127] 0 0 1 1 0 0 0 2 0 1 2 0 2 0 0 1 0 0
## [145] 1 0 1 2 2 2 6 1 3 5 1 1 1 3 0 0 2 0
## [163] 2 2 5 0 0 0 0 0 0 1 2 0 0 0 6 1 2 3
## [181] 5 1 6 1 0 1 1 1 0 0 3 0 1 3 1 1 1 1
## [199] 2 3 3 0 0 1 1 0 0 1 1 0 2 1 0 2 0 2
## [217] 0 1 2 0 0 0 2 0 1 1 1 0 0 1 1 1 0 0
## [235] 0 0 0 0 0 4 3 1 0 0 1 0 5 0 1 0 0 4
## [253] 3 0 1 1 1 1 0 3 1 0 0 2 4 0 1 0 1 0
## [271] 3 1 2 1 1 0 1 4 1 3 1 1 0 0 1 1 0 1
## [289] 1 0 2 0 0 1 2 4 0 3 0 0 1 1 0 2 1 4
## [307] 0 0 0 0 3 0 0 1 0 0 1 0 1 0 1 1 0 0
## [325] 0 6 0 0 0 0 2 0 1 2 0 0 1 0 1 0 0 0
## [343] 1 6 0 4 0 1 1 2 0 1 0 0 0 0 0 0 6 1
## [361] 3 1 0 0 0 3 2 3 2 0 2 0 0 2 0 2 0 1
## [379] 1 4 0 2 0 2 0 5 0 2 4 1 0 2 6 0 0 2
## [397] 2 3 4 2 3 0 0 0 0 3 0 0 1 2 0 0 1 0
## [415] 3 0 0 1 1 0 0 1 1 1 0 0 0 0 0 3 5 1
## [433] 0 3 2 0 0 2 1 0 0 0 3 0 1 0 1 0 4 1
## [451] 2 1 1 1 0 0 0
Samples with the SJ of interest:
table(GeneSJ$chr17_7675216_7675993>0)
##
## FALSE TRUE
## 207 250
Groups of the samples having the alternative splice junction:
table(GeneSJ$GROUP[GeneSJ$chr17_7675216_7675993 > 0])
##
## MUT WT
## 1 249
Alternative SJ found in the mutated samples.
Search: chr17:7675234-7676033
Show all the splice junctions containing the positions between 7675230 and 7675239
colnames(GeneSJ)[grep("767523",colnames(GeneSJ))]
## [1] "chr17_7675234_7676033" "chr17_7675237_7675335" "chr17_7675237_7675349"
## [4] "chr17_7675237_7675602" "chr17_7675237_7675884" "chr17_7675237_7675993"
## [7] "chr17_7675237_7676193" "chr17_7675237_7676381" "chr17_7675237_7687376"
## [10] "chr17_7675239_7676001"
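Matching on a text fragment such as `767523` selects a fixed block of ten positions and depends on where the decimal boundary falls relative to the site of interest. An alternative sketch parses the `chr_start_end` column names and filters on numeric distance instead. `find_sj_near` and its `window` argument are hypothetical; the junction names in the example are taken from the listings in this report.

```r
# Hypothetical helper: junction columns with a start or end coordinate
# within `window` bases of `position`.
find_sj_near <- function(col_names, position, window = 5) {
  # Keep only names of the form chr<id>_<start>_<end>.
  sj_names <- grep("^chr[^_]+_[0-9]+_[0-9]+$", col_names, value = TRUE)
  parts <- strsplit(sj_names, "_", fixed = TRUE)
  sj_start <- as.numeric(vapply(parts, `[`, character(1), 2))
  sj_end <- as.numeric(vapply(parts, `[`, character(1), 3))
  near <- abs(sj_start - position) <= window | abs(sj_end - position) <= window
  sj_names[near]
}

col_names <- c(
  "sample_id",
  "chr17_7674972_7675052",
  "chr17_7675216_7675993",
  "chr17_7675234_7676033",
  "chr17_7675237_7675993",
  "chr17_7675239_7676001"
)
hits <- find_sj_near(col_names, position = 7675235, window = 5)
print(hits)
```

The window width is a free choice; it is an assumption of the sketch and not a value used in the original search.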
Found: chr17_7675234_7676033
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr17_7675234_7676033
## [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [38] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [75] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [112] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [149] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [186] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [223] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [260] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [297] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [334] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [371] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [408] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [445] 0 0 0 0 0 0 0 0 0 0 0 0 0
Samples with the SJ of interest:
table(GeneSJ$chr17_7675234_7676033>0)
##
## FALSE TRUE
## 456 1
Groups of the samples having the alternative splice junction:
table(GeneSJ$GROUP[GeneSJ$chr17_7675234_7676033 > 0])
##
## WT
## 1
Alternative SJ not found in the mutated samples of the splice junction collection.
Canonical DE: chr17:7674972-7675052
Canonical UE: chr17:7675237-7675993; acceptor splice site chr17:7675237
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr17_7674972_7675052
## [1] 141 24 108 9 10 90 187 146 72 53 165 160 39 11 90 106 169 255
## [19] 43 175 11 89 88 15 75 39 41 2 31 212 177 49 216 88 17 35
## [37] 122 27 47 37 88 32 219 139 67 93 115 7 57 46 42 13 39 113
## [55] 3 109 12 149 183 139 34 4 78 37 69 6 35 65 183 59 217 55
## [73] 20 148 94 125 25 295 31 160 151 91 8 185 33 116 16 161 61 50
## [91] 136 74 41 49 29 136 158 140 89 96 63 94 158 26 100 317 96 60
## [109] 6 10 111 72 42 60 58 207 193 127 57 167 12 203 163 152 17 21
## [127] 16 66 111 58 60 49 116 62 59 99 35 11 183 8 8 163 12 36
## [145] 281 14 37 32 233 117 96 168 50 213 80 81 196 55 14 5 98 45
## [163] 63 30 170 76 73 52 153 49 39 71 32 11 76 7 155 93 123 79
## [181] 144 71 132 51 21 31 51 17 123 10 44 27 34 77 208 19 119 95
## [199] 120 138 80 22 11 39 91 31 70 131 111 77 47 135 14 99 33 27
## [217] 33 160 105 23 24 41 148 48 153 35 75 36 123 90 71 80 41 74
## [235] 123 68 17 68 62 161 93 100 144 115 82 19 93 0 82 19 18 32
## [253] 134 127 46 146 32 150 173 126 133 32 42 156 84 15 110 189 11 92
## [271] 252 55 154 82 60 100 104 144 28 55 165 110 88 62 25 121 33 10
## [289] 18 20 117 48 65 81 66 225 18 189 31 221 71 12 13 56 99 113
## [307] 24 17 19 23 153 53 67 79 293 113 12 64 43 26 73 47 42 28
## [325] 105 163 102 64 64 16 259 54 46 59 87 1 98 110 57 130 59 9
## [343] 203 221 57 180 66 114 32 93 32 70 23 58 45 8 19 107 60 30
## [361] 98 7 15 12 70 150 129 121 19 240 332 10 62 8 8 62 81 18
## [379] 135 128 2 116 34 29 4 118 21 154 137 54 89 109 80 6 34 184
## [397] 71 126 213 114 322 13 45 97 33 165 88 12 70 98 293 24 59 10
## [415] 170 53 79 70 56 5 53 174 45 66 33 105 26 78 159 123 104 105
## [433] 0 78 53 49 185 39 176 21 17 15 160 132 163 79 30 27 366 96
## [451] 75 104 118 122 14 5 97
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr17_7675237_7675993
## [1] 390 69 315 4 84 356 962 504 248 329 471 402 109 63 472
## [16] 300 293 660 167 492 103 417 267 21 143 92 152 4 86 356
## [31] 963 140 616 286 148 92 341 85 138 117 506 21 385 408 189
## [46] 289 287 25 195 172 142 51 106 561 20 550 40 238 549 339
## [61] 134 28 173 261 191 24 227 177 612 148 261 165 78 417 316
## [76] 354 79 894 108 492 909 257 38 614 119 835 77 481 136 143
## [91] 719 436 176 104 90 446 388 406 286 392 267 658 488 76 388
## [106] 637 295 204 18 12 306 338 131 350 182 583 1042 278 152 435
## [121] 44 384 452 514 56 101 38 205 336 151 173 154 307 329 176
## [136] 308 113 17 540 21 21 395 38 102 667 52 109 82 659 442
## [151] 332 477 136 522 304 316 588 168 61 57 332 78 180 90 1235
## [166] 178 211 149 415 146 66 159 173 36 229 8 942 328 387 144
## [181] 564 236 876 139 110 75 221 62 354 10 372 154 114 256 602
## [196] 37 356 280 435 316 197 55 23 67 261 97 463 387 321 150
## [211] 298 269 60 446 54 50 101 532 259 80 62 135 416 75 435
## [226] 78 260 118 382 331 174 277 141 227 818 189 19 188 203 612
## [241] 243 178 445 463 307 75 233 4 251 128 71 120 422 422 154
## [256] 264 111 245 593 339 396 134 135 484 588 40 363 518 48 260
## [271] 788 157 403 352 225 189 519 374 105 369 447 293 165 227 91
## [286] 213 69 56 145 35 437 145 189 132 171 316 52 553 94 637
## [301] 114 31 59 173 416 387 70 48 56 61 394 196 218 247 462
## [316] 357 72 143 125 143 176 115 95 80 352 1359 337 211 555 49
## [331] 390 116 115 170 172 5 315 321 214 271 218 38 586 748 166
## [346] 1241 134 497 232 277 34 222 51 155 125 47 45 193 195 245
## [361] 551 31 72 38 184 378 367 411 126 424 581 24 159 47 28
## [376] 170 228 43 419 436 2 366 101 89 11 361 92 373 414 153
## [391] 255 359 576 12 65 416 200 426 572 310 527 59 108 331 76
## [406] 912 208 34 331 375 541 54 189 19 506 96 410 367 280 42
## [421] 261 269 138 164 112 296 84 465 549 371 577 392 6 253 250
## [436] 101 572 102 418 27 41 23 416 439 237 176 81 64 654 622
## [451] 161 289 408 317 87 8 315
Count the reads of all the splice junctions of the gene harboring the variant:
GeneSJ$rowSum_SJtotal <- rowSums(GeneSJ[,grep("chr", names(GeneSJ))])
Normalization of the expression by the total read counts of all the splice junctions of the gene:
GeneSJ$Normalized_CanonDE <- (GeneSJ$chr17_7674972_7675052)/GeneSJ$rowSum_SJtotal*100
GeneSJ$Normalized_CanonUE <- (GeneSJ$chr17_7675237_7675993)/GeneSJ$rowSum_SJtotal*100
GeneSJ$Normalized_AG <- (GeneSJ$chr17_7675216_7675993)/GeneSJ$rowSum_SJtotal*100
Download the normalized values for the assessed splice junctions of all the AML samples:
VAF of the mutated samples:
Canonical splice junction chr17:7675237-7675993, acceptor splice site chr17:7675237
Splicing alterations:
Canonical splice junction chr17:7675237-7675993, acceptor splice site chr17:7675237
Splicing alteration:
Violin Plots for the alternative splice junctions interrogated:
SJCounts <- GeneSJ
Normality Test:
shapiro.test(SJCounts$Normalized_CanonUE[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_CanonUE[SJCounts$GROUP == "WT"]
## W = 0.85443, p-value < 2.2e-16
Value of Mean Normalized Expression of the Canonical SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_CanonUE[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 12.83588
Normalized Expression Value of the Canonical SJ in the MUT sample:
MUT_SJi <- SJCounts$Normalized_CanonUE[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 5.270598
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_CanonUE - mean_WT_SJi
Difference in the MUT sample: deviation of the Normalized expression of the MUT patient from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] -7.565282
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:453] = -10.017, -9.1322, -8.6692, ..., 11.746, 25.626
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.006578947
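# Collect the MUT sample(s) and their normalized expression for this junction.
MUT_df <- SJCounts[SJCounts$GROUP == "MUT", c("sample_id", "case_id", "Normalized_CanonUE")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
# ECDF: fraction of WT samples whose deviation is <= that of the MUT sample.
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
# Loss of a canonical junction is read in the lower tail, so the p-value is the ECDF itself.
MUT_df$Pvalue <- MUT_df$ECDF
MUT_df$Prediction <- "Aceptor Loss"
MUT_df$splice_junction_status <- "CanonicalSJ"
MUT_df$splice_junction_position <- "chr17:7675237-7675993"
# Samples with zero normalized expression are reported as NA rather than scored.
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA, MUT_df$ECDF)
MUT_df$Pvalue <- ifelse(MUT_df$NormalizedExpression == 0, NA, MUT_df$Pvalue)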
Download the VAF, inferred percentiles and p-values of the mutated samples:
Normality Test:
shapiro.test(SJCounts$Normalized_AG[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_AG[SJCounts$GROUP == "WT"]
## W = 0.72791, p-value < 2.2e-16
Value of Mean Normalized Expression of the Alternative SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_AG[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 0.05448304
Normalized Expression Value of the Alternative SJ in the MUT sample:
MUT_SJi <- SJCounts$Normalized_AG[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 4.563813
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_AG - mean_WT_SJi
Difference in the MUT sample: deviation of the Normalized expression of the MUT patient from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] 4.50933
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:247] = -0.054483, -0.034956, -0.034744, ..., 0.37748, 0.53203
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 1
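# Collect the MUT sample(s) and their normalized expression for this junction.
MUT_df <- SJCounts[SJCounts$GROUP == "MUT", c("sample_id", "case_id", "Normalized_AG")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
# ECDF: fraction of WT samples whose deviation is <= that of the MUT sample.
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
# A gained acceptor junction is read in the upper tail, so the p-value is 1 - ECDF.
MUT_df$Pvalue <- 1 - MUT_df$ECDF
MUT_df$Prediction <- "Aceptor Gain"
MUT_df$splice_junction_status <- "AlternativeSJ found in MUT samples"
MUT_df$splice_junction_position <- "chr17:7675216-7675993"
# Samples with zero normalized expression are reported as NA rather than scored.
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA, MUT_df$ECDF)
MUT_df$Pvalue <- ifelse(MUT_df$NormalizedExpression == 0, NA, MUT_df$Pvalue)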
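When the mutated sample lies beyond every WT sample, the ECDF is exactly 1 and `1 - ECDF` gives a p-value of 0, which is more than a finite reference group can support. A commonly used alternative, shown here as a sketch and not as the method of this report, adds one to both the numerator and the denominator of the empirical p-value. `empirical_p_upper` and the toy values are hypothetical.

```r
# Hypothetical helper: upper-tail empirical p-value with a pseudo-count,
# so that the result can never be exactly 0.
empirical_p_upper <- function(wt_values, mut_value) {
  wt_values <- wt_values[!is.na(wt_values)]
  (sum(wt_values >= mut_value) + 1) / (length(wt_values) + 1)
}

# Toy values, not taken from the cohort.
wt <- c(0, 0, 0.1, 0.2, 0.5)
mut <- 4.5

# Plain upper-tail value as used in the report: 1 - ECDF.
p_plain <- 1 - ecdf(wt)(mut)

# Pseudo-count version: (0 + 1) / (5 + 1).
p_adjusted <- empirical_p_upper(wt, mut)

print(c(plain = p_plain, adjusted = p_adjusted))
```

With this variant the smallest attainable p-value is `1 / (n + 1)` for `n` WT samples, which makes the resolution of the reference group explicit.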
Download the VAF, inferred percentiles and p-values of the mutated samples:
Variant found in 4 patients of the BeatAML cohort (4 samples)
The splicing alterations being assessed are:
Variant information:
Load the extracted splice junctions of the gene harboring the mutation.
extractedSJ_path <- paste0(extractedSJ_dir_in,"TP53_UM_annotSJ.tsv")
GeneSJ <- read.delim(extractedSJ_path, sep ="\t")
Set each sample's group: mutated (MUT) or non-mutated (WT)
samples_df <- found_variants[found_variants$Gene=="TP53" & found_variants$MutationKey_Hg38 == "chr17,7675217,T,C",]
cases <- samples_df$RNA_Sample[samples_df$Validable == "Validable"]
GeneSJ$GROUP <- ifelse(GeneSJ$sample_id %in% cases , "MUT", "WT")
Search for the splice junctions of interest in the extracted splice junctions of the gene by position (chr_SJstart_SJend).
Search: chr17:7675216-7675993
Show all the splice junctions containing the position chr17:7675216-7675993
colnames(GeneSJ)[grep("7675216_7675993",colnames(GeneSJ))]
## [1] "chr17_7675216_7675993"
Found: chr17_7675216_7675993
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr17_7675216_7675993
## [1] 1 0 0 0 0 2 3 1 0 0 3 1 2 0 1 2 0 4
## [19] 0 0 0 1 0 0 1 0 2 0 1 3 0 2 3 0 0 0
## [37] 1 1 1 0 2 0 2 4 3 1 0 0 1 1 0 0 3 3
## [55] 0 1 0 0 4 0 0 0 0 0 1 0 1 0 0 1 226 0
## [73] 1 2 0 1 1 7 0 2 4 0 0 3 0 3 0 3 1 1
## [91] 5 3 1 0 0 1 3 2 2 1 1 1 2 0 2 4 0 0
## [109] 0 0 0 0 1 4 0 3 4 0 2 3 1 1 2 2 0 0
## [127] 0 0 1 1 0 0 0 2 0 1 2 0 2 0 0 1 0 0
## [145] 1 0 1 2 2 2 6 1 3 5 1 1 1 3 0 0 2 0
## [163] 2 2 5 0 0 0 0 0 0 1 2 0 0 0 6 1 2 3
## [181] 5 1 6 1 0 1 1 1 0 0 3 0 1 3 1 1 1 1
## [199] 2 3 3 0 0 1 1 0 0 1 1 0 2 1 0 2 0 2
## [217] 0 1 2 0 0 0 2 0 1 1 1 0 0 1 1 1 0 0
## [235] 0 0 0 0 0 4 3 1 0 0 1 0 5 0 1 0 0 4
## [253] 3 0 1 1 1 1 0 3 1 0 0 2 4 0 1 0 1 0
## [271] 3 1 2 1 1 0 1 4 1 3 1 1 0 0 1 1 0 1
## [289] 1 0 2 0 0 1 2 4 0 3 0 0 1 1 0 2 1 4
## [307] 0 0 0 0 3 0 0 1 0 0 1 0 1 0 1 1 0 0
## [325] 0 6 0 0 0 0 2 0 1 2 0 0 1 0 1 0 0 0
## [343] 1 6 0 4 0 1 1 2 0 1 0 0 0 0 0 0 6 1
## [361] 3 1 0 0 0 3 2 3 2 0 2 0 0 2 0 2 0 1
## [379] 1 4 0 2 0 2 0 5 0 2 4 1 0 2 6 0 0 2
## [397] 2 3 4 2 3 0 0 0 0 3 0 0 1 2 0 0 1 0
## [415] 3 0 0 1 1 0 0 1 1 1 0 0 0 0 0 3 5 1
## [433] 0 3 2 0 0 2 1 0 0 0 3 0 1 0 1 0 4 1
## [451] 2 1 1 1 0 0 0
Samples with the SJ of interest:
table(GeneSJ$chr17_7675216_7675993>0)
##
## FALSE TRUE
## 207 250
Groups of the samples having the alternative splice junction:
table(GeneSJ$GROUP[GeneSJ$chr17_7675216_7675993 > 0])
##
## MUT WT
## 1 249
Alternative SJ found in the mutated samples.
Canonical DE: chr17:7674972-7675052
Canonical UE: chr17:7675237-7675993; acceptor splice site chr17:7675237
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr17_7674972_7675052
## [1] 141 24 108 9 10 90 187 146 72 53 165 160 39 11 90 106 169 255
## [19] 43 175 11 89 88 15 75 39 41 2 31 212 177 49 216 88 17 35
## [37] 122 27 47 37 88 32 219 139 67 93 115 7 57 46 42 13 39 113
## [55] 3 109 12 149 183 139 34 4 78 37 69 6 35 65 183 59 217 55
## [73] 20 148 94 125 25 295 31 160 151 91 8 185 33 116 16 161 61 50
## [91] 136 74 41 49 29 136 158 140 89 96 63 94 158 26 100 317 96 60
## [109] 6 10 111 72 42 60 58 207 193 127 57 167 12 203 163 152 17 21
## [127] 16 66 111 58 60 49 116 62 59 99 35 11 183 8 8 163 12 36
## [145] 281 14 37 32 233 117 96 168 50 213 80 81 196 55 14 5 98 45
## [163] 63 30 170 76 73 52 153 49 39 71 32 11 76 7 155 93 123 79
## [181] 144 71 132 51 21 31 51 17 123 10 44 27 34 77 208 19 119 95
## [199] 120 138 80 22 11 39 91 31 70 131 111 77 47 135 14 99 33 27
## [217] 33 160 105 23 24 41 148 48 153 35 75 36 123 90 71 80 41 74
## [235] 123 68 17 68 62 161 93 100 144 115 82 19 93 0 82 19 18 32
## [253] 134 127 46 146 32 150 173 126 133 32 42 156 84 15 110 189 11 92
## [271] 252 55 154 82 60 100 104 144 28 55 165 110 88 62 25 121 33 10
## [289] 18 20 117 48 65 81 66 225 18 189 31 221 71 12 13 56 99 113
## [307] 24 17 19 23 153 53 67 79 293 113 12 64 43 26 73 47 42 28
## [325] 105 163 102 64 64 16 259 54 46 59 87 1 98 110 57 130 59 9
## [343] 203 221 57 180 66 114 32 93 32 70 23 58 45 8 19 107 60 30
## [361] 98 7 15 12 70 150 129 121 19 240 332 10 62 8 8 62 81 18
## [379] 135 128 2 116 34 29 4 118 21 154 137 54 89 109 80 6 34 184
## [397] 71 126 213 114 322 13 45 97 33 165 88 12 70 98 293 24 59 10
## [415] 170 53 79 70 56 5 53 174 45 66 33 105 26 78 159 123 104 105
## [433] 0 78 53 49 185 39 176 21 17 15 160 132 163 79 30 27 366 96
## [451] 75 104 118 122 14 5 97
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr17_7675237_7675993
## [1] 390 69 315 4 84 356 962 504 248 329 471 402 109 63 472
## [16] 300 293 660 167 492 103 417 267 21 143 92 152 4 86 356
## [31] 963 140 616 286 148 92 341 85 138 117 506 21 385 408 189
## [46] 289 287 25 195 172 142 51 106 561 20 550 40 238 549 339
## [61] 134 28 173 261 191 24 227 177 612 148 261 165 78 417 316
## [76] 354 79 894 108 492 909 257 38 614 119 835 77 481 136 143
## [91] 719 436 176 104 90 446 388 406 286 392 267 658 488 76 388
## [106] 637 295 204 18 12 306 338 131 350 182 583 1042 278 152 435
## [121] 44 384 452 514 56 101 38 205 336 151 173 154 307 329 176
## [136] 308 113 17 540 21 21 395 38 102 667 52 109 82 659 442
## [151] 332 477 136 522 304 316 588 168 61 57 332 78 180 90 1235
## [166] 178 211 149 415 146 66 159 173 36 229 8 942 328 387 144
## [181] 564 236 876 139 110 75 221 62 354 10 372 154 114 256 602
## [196] 37 356 280 435 316 197 55 23 67 261 97 463 387 321 150
## [211] 298 269 60 446 54 50 101 532 259 80 62 135 416 75 435
## [226] 78 260 118 382 331 174 277 141 227 818 189 19 188 203 612
## [241] 243 178 445 463 307 75 233 4 251 128 71 120 422 422 154
## [256] 264 111 245 593 339 396 134 135 484 588 40 363 518 48 260
## [271] 788 157 403 352 225 189 519 374 105 369 447 293 165 227 91
## [286] 213 69 56 145 35 437 145 189 132 171 316 52 553 94 637
## [301] 114 31 59 173 416 387 70 48 56 61 394 196 218 247 462
## [316] 357 72 143 125 143 176 115 95 80 352 1359 337 211 555 49
## [331] 390 116 115 170 172 5 315 321 214 271 218 38 586 748 166
## [346] 1241 134 497 232 277 34 222 51 155 125 47 45 193 195 245
## [361] 551 31 72 38 184 378 367 411 126 424 581 24 159 47 28
## [376] 170 228 43 419 436 2 366 101 89 11 361 92 373 414 153
## [391] 255 359 576 12 65 416 200 426 572 310 527 59 108 331 76
## [406] 912 208 34 331 375 541 54 189 19 506 96 410 367 280 42
## [421] 261 269 138 164 112 296 84 465 549 371 577 392 6 253 250
## [436] 101 572 102 418 27 41 23 416 439 237 176 81 64 654 622
## [451] 161 289 408 317 87 8 315
Count the reads of all the splice junctions of the gene harboring the variant:
GeneSJ$rowSum_SJtotal <- rowSums(GeneSJ[,grep("chr", names(GeneSJ))])
Normalization of the expression by the total read counts of all the splice junctions of the gene:
GeneSJ$Normalized_CanonDE <- (GeneSJ$chr17_7674972_7675052)/GeneSJ$rowSum_SJtotal*100
GeneSJ$Normalized_CanonUE <- (GeneSJ$chr17_7675237_7675993)/GeneSJ$rowSum_SJtotal*100
GeneSJ$Normalized_AG <- (GeneSJ$chr17_7675216_7675993)/GeneSJ$rowSum_SJtotal*100
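The normalization above expresses each junction as a percentage of all junction reads of the gene in the same sample. A minimal sketch on a toy table follows. The helper name `normalize_sj` and the counts are hypothetical, and anchoring the pattern as `^chr` is an assumption made here so that only junction columns enter the total.

```r
# Toy junction table: one row per sample, one column per splice junction.
# Column names follow the chr_SJstart_SJend convention; the counts are invented.
toy <- data.frame(
  sample_id = c("S1", "S2", "S3"),
  chr17_100_200 = c(90, 40, 0),
  chr17_150_200 = c(10, 10, 0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: percentage of a sample's junction reads supporting one junction.
normalize_sj <- function(df, junction) {
  sj_cols <- grep("^chr", names(df), value = TRUE)   # junction columns only
  total <- rowSums(df[, sj_cols, drop = FALSE])
  pct <- df[[junction]] / total * 100
  pct[total == 0] <- NA   # no junction reads at all: report NA instead of NaN
  pct
}

toy$Normalized_alt <- normalize_sj(toy, "chr17_150_200")
toy$Normalized_alt
```

Samples without any junction read in the gene come out as NA here, which keeps them out of the mean and of the empirical distribution instead of propagating NaN.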
Download the normalized values for the assessed splice junctions of all the AML samples:
VAF of the mutated samples:
Canonical splice junction chr17:7675237-7675993, acceptor splice site chr17:7675237
Splicing alterations:
Canonical splice junction chr17:7675237-7675993, acceptor splice site chr17:7675237
Splicing alteration:
Violin Plots for the alternative splice junctions interrogated:
SJCounts <- GeneSJ
Normality Test:
shapiro.test(SJCounts$Normalized_CanonUE[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_CanonUE[SJCounts$GROUP == "WT"]
## W = 0.85347, p-value < 2.2e-16
Mean normalized expression of the canonical SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_CanonUE[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 12.82575
Normalized expression of the canonical SJ in the MUT samples:
MUT_SJi <- SJCounts$Normalized_CanonUE[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 18.781726 5.270598 11.630769 12.683578
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_CanonUE - mean_WT_SJi
Difference in the MUT sample: deviation of the Normalized expression of the MUT patient from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] 5.9559749 -7.5551532 -1.1949817 -0.1421729
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:450] = -10.007, -9.122, -8.6591, ..., 11.757, 25.636
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.982339956 0.006622517 0.280353201 0.494481236
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_CanonUE")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
MUT_df$Pvalue <- MUT_df$ECDF
MUT_df$Prediction <- "Aceptor Loss"
MUT_df$splice_junction_status <- "CanonicalSJ"
MUT_df$splice_junction_position <- "chr17:7675237-7675993"
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
MUT_df$Pvalue <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$Pvalue)
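The steps above place each mutated sample within the empirical distribution of the WT samples. For a predicted loss, the lower-tail fraction is used as the p-value. The sketch below reproduces that logic on invented values. The helper `empirical_tail` is hypothetical and not part of the analysis code.

```r
# Invented WT and MUT normalized expression values (percent of gene junction reads).
wt_values  <- c(8, 10, 11, 12, 13, 14, 15, 16, 18, 25)
mut_values <- c(5, 13, 30)

# Hypothetical helper: empirical percentile of each MUT value among the WT values,
# plus a one-sided tail fraction in the direction of the predicted effect.
empirical_tail <- function(wt, mut, direction = c("loss", "gain")) {
  direction <- match.arg(direction)
  wt_ecdf <- ecdf(wt)            # step function: fraction of WT values <= x
  pct <- wt_ecdf(mut)
  p_tail <- if (direction == "loss") pct else 1 - pct
  data.frame(value = mut, ECDF = pct, Pvalue = p_tail)
}

loss <- empirical_tail(wt_values, mut_values, "loss")
loss

# Centring on the WT mean shifts both groups equally, so the percentiles are unchanged.
centred <- ecdf(wt_values - mean(wt_values))(mut_values - mean(wt_values))
all.equal(centred, loss$ECDF)
```

Because the same WT mean is subtracted from both groups, the `Difference` column changes the scale of the ECDF plot but not the percentile assigned to each mutated sample.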
Download the VAF, inferred percentiles, and p-values of the mutated samples:
Normality Test:
shapiro.test(SJCounts$Normalized_AG[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_AG[SJCounts$GROUP == "WT"]
## W = 0.72981, p-value < 2.2e-16
Value of Mean Normalized Expression of the Alternative SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_AG[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 0.05484385
Normalized Expression Value of the Alternative SJ in the MUT sample:
MUT_SJi <- SJCounts$Normalized_AG[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 0.000000 4.563813 0.000000 0.000000
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_AG - mean_WT_SJi
Difference in the MUT sample: deviation of the Normalized expression of the MUT patient from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] -0.05484385 4.50896875 -0.05484385 -0.05484385
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:247] = -0.054844, -0.035316, -0.035104, ..., 0.37712, 0.53167
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.4503311 1.0000000 0.4503311 0.4503311
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_AG")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
MUT_df$Pvalue <- 1- MUT_df$ECDF
MUT_df$Prediction <- "Aceptor Gain"
MUT_df$splice_junction_status <- "AlternativeSJ found in MUT samples"
MUT_df$splice_junction_position <- "chr17:7675216-7675993"
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
MUT_df$Pvalue <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$Pvalue)
Download the VAF, inferred percentiles, and p-values of the mutated samples:
Variant found in 6 patients of the BeatAML cohort (7 samples).
The splicing alterations being assessed are:
Variant information:
Load the extracted splice junctions of the gene harboring the mutation.
extractedSJ_path <- paste0(extractedSJ_dir_in,"TP53_UM_annotSJ.tsv")
GeneSJ <- read.delim(extractedSJ_path, sep ="\t")
Set the sample’s group: Mutated (MUT) or Non-Mutated (WT)
samples_df <- found_variants[found_variants$Gene=="TP53" & found_variants$MutationKey_Hg38 == "chr17,7674872,T,C",]
cases <- samples_df$RNA_Sample[samples_df$Validable == "Validable"]
GeneSJ$GROUP <- ifelse(GeneSJ$sample_id %in% cases , "MUT", "WT")
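The group assignment above can be sketched on a toy variant table. The helper `assign_group`, the sample identifiers, and the "Not validable" level are assumptions for illustration only.

```r
# Invented variant table; the "Not validable" level is an assumption for illustration.
found_variants_toy <- data.frame(
  Gene = c("TP53", "TP53", "NRAS"),
  MutationKey_Hg38 = c("chr17,7674872,T,C", "chr17,7674872,T,C", "chr1,114716123,C,T"),
  RNA_Sample = c("S1", "S2", "S3"),
  Validable = c("Validable", "Not validable", "Validable"),
  stringsAsFactors = FALSE
)
sample_ids <- c("S1", "S2", "S3", "S4")

# Hypothetical helper: MUT for validable carriers of the variant, WT otherwise.
assign_group <- function(sample_ids, variants, gene, key) {
  hit <- variants$Gene == gene &
    variants$MutationKey_Hg38 == key &
    variants$Validable == "Validable"
  ifelse(sample_ids %in% variants$RNA_Sample[hit], "MUT", "WT")
}

groups <- assign_group(sample_ids, found_variants_toy, "TP53", "chr17,7674872,T,C")
table(groups)
```

Under this logic a carrier that is not flagged as validable (S2 in the toy table) is labelled WT, so it may be worth checking whether such samples should instead be excluded from the reference distribution.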
Search for the splice junctions of interest in the extracted splice junctions of the gene by position (chr_SJstart_SJend).
Search: predicted at 1bp from the variant position, chr17,7674872
Show all the splice junctions containing the positions between 7674870 - 7674879
colnames(GeneSJ)[grep("767487",colnames(GeneSJ))]
## character(0)
Alternative SJ not found in the splice junction collection.
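The search above matches a text prefix, so it only covers the ten positions sharing those leading digits. A possible alternative, sketched below, parses the junction names and compares coordinates numerically within a window. The helper `find_sj_near` is hypothetical, and it assumes chromosome names without underscores.

```r
# Junction column names in the chr_SJstart_SJend convention (a small subset for illustration).
sj_names <- c("chr17_7674291_7674858", "chr17_7674291_7675052",
              "chr17_7674972_7675052", "chr17_7675237_7675993")

# Hypothetical helper: junctions with a start or end within `window` bp of `pos`.
# Assumes names split into exactly three fields (no underscores in the chromosome name).
find_sj_near <- function(sj_names, chrom, pos, window = 5) {
  parts <- do.call(rbind, strsplit(sj_names, "_", fixed = TRUE))
  sj_chr   <- parts[, 1]
  sj_start <- as.numeric(parts[, 2])
  sj_end   <- as.numeric(parts[, 3])
  near <- sj_chr == chrom &
    (abs(sj_start - pos) <= window | abs(sj_end - pos) <= window)
  sj_names[near]
}

find_sj_near(sj_names, "chr17", 7674872)               # nothing within 5 bp
find_sj_near(sj_names, "chr17", 7674872, window = 20)  # reaches the 7674858 boundary
```

With the real table, the junction names could be passed as `grep("^chr", colnames(GeneSJ), value = TRUE)` so that annotation columns such as `sample_id` are left out.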
Search: chr17:7674291-7675052
Show all the splice junctions containing the position chr17:7674291-7675052
colnames(GeneSJ)[grep("7674291_7675052",colnames(GeneSJ))]
## [1] "chr17_7674291_7675052"
Found: chr17_7674291_7675052
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr17_7674291_7675052
## [1] 0 0 0 0 0 0 0 2 2 1 0 1 0 0 1 2 3 0 0 1 0 1 0 1 8
## [26] 0 0 0 4 1 1 3 2 1 0 1 1 2 1 0 0 4 1 3 0 1 1 1 1 0
## [51] 2 2 3 1 0 1 0 2 15 1 4 2 0 2 0 1 1 7 0 1 4 2 0 3 0
## [76] 0 1 2 1 1 3 0 1 3 1 0 0 0 3 0 4 0 2 2 4 1 0 0 2 1
## [101] 6 0 1 0 2 2 2 2 0 0 1 5 0 1 2 1 1 0 0 1 1 1 0 1 2
## [126] 0 2 1 0 4 2 2 0 2 0 0 3 0 1 1 0 0 1 1 0 2 1 4 1 1
## [151] 0 5 2 2 0 0 6 0 2 2 0 3 1 1 5 0 1 0 0 0 8 2 0 1 1
## [176] 1 1 0 0 0 2 0 5 3 1 0 4 1 1 0 0 5 0 1 2 3 2 1 1 0
## [201] 0 3 2 1 1 0 1 0 1 1 5 4 1 1 2 0 5 0 3 1 4 1 2 0 1
## [226] 1 1 1 4 0 0 1 2 3 4 1 1 0 2 0 1 2 5 1 1 3 4 0 2 1
## [251] 1 3 3 0 2 0 2 0 1 0 2 8 3 0 2 0 0 0 1 1 2 1 1 4 0
## [276] 0 1 1 0 0 2 2 4 2 1 1 5 0 0 1 2 13 0 0 0 1 0 0 0 0
## [301] 0 0 2 0 2 0 1 0 0 3 0 1 5 1 0 1 2 0 2 2 1 0 7 1 1
## [326] 2 0 0 0 0 5 1 0 0 3 0 0 1 2 0 1 1 1 0 0 2 2 1 0 2
## [351] 0 0 1 0 1 0 7 1 1 5 1 2 0 4 3 0 0 7 9 0 0 4 1 3 3
## [376] 2 2 1 0 0 0 1 1 2 0 0 1 1 0 0 1 0 1 0 0 0 1 0 2 0
## [401] 0 0 2 1 2 1 0 1 2 1 1 0 0 0 0 1 1 0 0 1 1 0 1 1 2
## [426] 5 15 1 1 0 0 2 0 0 1 0 0 1 0 2 0 0 0 1 0 0 0 0 1 1
## [451] 0 1 0 2 0 0 0
Samples with the SJ of interest:
table(GeneSJ$chr17_7674291_7675052>0)
##
## FALSE TRUE
## 179 278
Groups of the samples having the alternative splice junction:
table(GeneSJ$GROUP[GeneSJ$chr17_7674291_7675052 > 0])
##
## MUT WT
## 4 274
Alternative SJ found in the mutated samples.
Exon upstream (UE): chr17:7674972-7675052
Exon downstream (DE): chr17:7674291-7674858
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr17_7674972_7675052
## [1] 141 24 108 9 10 90 187 146 72 53 165 160 39 11 90 106 169 255
## [19] 43 175 11 89 88 15 75 39 41 2 31 212 177 49 216 88 17 35
## [37] 122 27 47 37 88 32 219 139 67 93 115 7 57 46 42 13 39 113
## [55] 3 109 12 149 183 139 34 4 78 37 69 6 35 65 183 59 217 55
## [73] 20 148 94 125 25 295 31 160 151 91 8 185 33 116 16 161 61 50
## [91] 136 74 41 49 29 136 158 140 89 96 63 94 158 26 100 317 96 60
## [109] 6 10 111 72 42 60 58 207 193 127 57 167 12 203 163 152 17 21
## [127] 16 66 111 58 60 49 116 62 59 99 35 11 183 8 8 163 12 36
## [145] 281 14 37 32 233 117 96 168 50 213 80 81 196 55 14 5 98 45
## [163] 63 30 170 76 73 52 153 49 39 71 32 11 76 7 155 93 123 79
## [181] 144 71 132 51 21 31 51 17 123 10 44 27 34 77 208 19 119 95
## [199] 120 138 80 22 11 39 91 31 70 131 111 77 47 135 14 99 33 27
## [217] 33 160 105 23 24 41 148 48 153 35 75 36 123 90 71 80 41 74
## [235] 123 68 17 68 62 161 93 100 144 115 82 19 93 0 82 19 18 32
## [253] 134 127 46 146 32 150 173 126 133 32 42 156 84 15 110 189 11 92
## [271] 252 55 154 82 60 100 104 144 28 55 165 110 88 62 25 121 33 10
## [289] 18 20 117 48 65 81 66 225 18 189 31 221 71 12 13 56 99 113
## [307] 24 17 19 23 153 53 67 79 293 113 12 64 43 26 73 47 42 28
## [325] 105 163 102 64 64 16 259 54 46 59 87 1 98 110 57 130 59 9
## [343] 203 221 57 180 66 114 32 93 32 70 23 58 45 8 19 107 60 30
## [361] 98 7 15 12 70 150 129 121 19 240 332 10 62 8 8 62 81 18
## [379] 135 128 2 116 34 29 4 118 21 154 137 54 89 109 80 6 34 184
## [397] 71 126 213 114 322 13 45 97 33 165 88 12 70 98 293 24 59 10
## [415] 170 53 79 70 56 5 53 174 45 66 33 105 26 78 159 123 104 105
## [433] 0 78 53 49 185 39 176 21 17 15 160 132 163 79 30 27 366 96
## [451] 75 104 118 122 14 5 97
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr17_7674291_7674858
## [1] 401 85 346 5 45 318 647 433 220 197 388 393 82 35 270 310 242 692
## [19] 136 461 37 326 216 25 99 97 154 7 49 347 603 109 585 256 85 76
## [37] 289 77 125 102 371 76 407 356 178 279 297 16 152 148 95 49 108 287
## [55] 17 423 27 193 491 369 124 9 158 134 175 23 153 134 478 114 501 141
## [73] 66 428 285 344 77 803 78 456 540 234 28 535 90 472 47 374 112 117
## [91] 524 301 167 105 81 391 364 308 249 357 236 337 432 80 314 462 272 168
## [109] 15 16 306 257 118 223 128 517 731 243 152 360 49 358 371 463 2 78
## [127] 20 145 303 173 152 153 268 248 154 264 83 18 504 17 21 398 33 88
## [145] 556 43 105 79 628 403 297 388 132 464 226 303 618 175 40 28 276 80
## [163] 107 85 691 144 210 113 349 131 53 156 143 18 187 11 598 268 391 126
## [181] 444 177 473 132 64 72 218 65 334 27 141 62 85 195 593 33 299 253
## [199] 409 242 164 41 30 55 205 87 230 342 282 130 205 210 34 338 46 32
## [217] 85 469 262 70 45 101 379 81 427 57 183 80 316 307 151 225 110 211
## [235] 484 199 21 130 174 449 258 155 355 361 239 54 192 1 209 93 40 90
## [253] 359 389 148 254 102 230 507 309 344 99 108 499 338 21 330 454 30 244
## [271] 807 137 316 302 211 172 388 297 100 239 371 269 131 207 64 194 41 36
## [289] 66 45 359 96 137 138 148 349 31 496 72 512 100 21 69 162 376 368
## [307] 66 44 47 52 424 166 207 224 507 325 52 190 119 105 155 131 80 83
## [325] 301 656 231 185 320 57 419 118 116 130 132 1 236 281 195 233 198 25
## [343] 479 711 141 680 142 380 140 249 38 207 42 146 119 31 42 143 197 141
## [361] 341 26 65 34 140 306 334 395 73 380 633 29 155 28 19 162 201 36
## [379] 371 308 4 289 92 73 5 307 84 321 385 117 203 299 320 15 44 342
## [397] 170 360 510 296 475 25 109 259 89 702 198 47 263 263 492 34 133 24
## [415] 460 76 288 253 204 17 129 299 150 172 95 272 41 253 506 385 316 328
## [433] 2 232 184 80 495 97 335 33 28 24 374 427 254 155 88 47 544 386
## [451] 152 306 350 265 49 8 261
Count the reads of all the splice junctions of the gene harboring the variant:
GeneSJ$rowSum_SJtotal <- rowSums(GeneSJ[,grep("chr", names(GeneSJ))])
Normalization of the expression by the total read counts of all the splice junctions of the gene:
GeneSJ$Normalized_CanonUE <- (GeneSJ$chr17_7674972_7675052)/GeneSJ$rowSum_SJtotal*100
GeneSJ$Normalized_CanonDE <- (GeneSJ$chr17_7674291_7674858)/GeneSJ$rowSum_SJtotal*100
GeneSJ$Normalized_ES <- (GeneSJ$chr17_7674291_7675052)/GeneSJ$rowSum_SJtotal*100
Download the normalized values for the assessed splice junctions of all the AML samples:
VAF of the mutated samples:
Splicing alterations:
Splicing alteration:
Violin Plots for the alternative splice junctions interrogated:
SJCounts <- GeneSJ
Normality Test:
shapiro.test(SJCounts$Normalized_ES[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_ES[SJCounts$GROUP == "WT"]
## W = 0.58035, p-value < 2.2e-16
Value of Mean Normalized Expression of the Alternative SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_ES[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 0.1275619
Normalized Expression Value of the Alternative SJ in the MUT sample:
MUT_SJi <- SJCounts$Normalized_ES[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 0.12165450 0.00000000 0.29112082 0.09465215 0.02803476
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_ES - mean_WT_SJi
Difference in the MUT sample: deviation of the Normalized expression of the MUT patient from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] -0.005907383 -0.127561884 0.163558931 -0.032909731 -0.099527121
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:269] = -0.12756, -0.1131, -0.1117, ..., 1.3909, 1.7266
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.7256637 0.3938053 0.8694690 0.6836283 0.4446903
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_ES")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
MUT_df$Pvalue <- NA
MUT_df$Prediction <- "Exon Skipping"
MUT_df$splice_junction_status <- "AlternativeSJ found in MUT samples"
MUT_df$splice_junction_position <- "chr17:7674291-7675052"
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
Download the VAF, inferred percentiles, and p-values of the mutated samples:
Normality:
shapiro.test(SJCounts$Normalized_ES[SJCounts$GROUP== "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_ES[SJCounts$GROUP == "WT"]
## W = 0.58035, p-value < 2.2e-16
shapiro.test(SJCounts$Normalized_ES[SJCounts$GROUP== "MUT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_ES[SJCounts$GROUP == "MUT"]
## W = 0.89878, p-value = 0.4032
Mann-Whitney:
wt <- wilcox.test(x=SJCounts$Normalized_ES[SJCounts$GROUP== "MUT"],
y=SJCounts$Normalized_ES[SJCounts$GROUP== "WT"],
alternative = "two.sided",
paired = FALSE,
conf.int = TRUE,
conf.level = 0.95)
wt
##
## Wilcoxon rank sum test with continuity correction
##
## data: SJCounts$Normalized_ES[SJCounts$GROUP == "MUT"] and SJCounts$Normalized_ES[SJCounts$GROUP == "WT"]
## W = 1320, p-value = 0.5057
## alternative hypothesis: true location shift is not equal to 0
## 95 percent confidence interval:
## -0.06114993 0.09467616
## sample estimates:
## difference in location
## 0.02797629
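In `wilcox.test`, `conf.int` is a logical flag that requests the interval, and the coverage is set through `conf.level`. A minimal sketch on invented values follows. The numbers are not taken from the cohort, and `exact = FALSE` is chosen here only because the toy data contain ties.

```r
# Invented normalized exon-skipping values for a small MUT group and a larger WT group.
mut <- c(0.12, 0.00, 0.29, 0.09, 0.03)
wt  <- c(0.00, 0.05, 0.08, 0.10, 0.11, 0.13, 0.15, 0.20, 0.40, 0.90)

res <- wilcox.test(x = mut, y = wt,
                   alternative = "two.sided",
                   paired = FALSE,
                   conf.int = TRUE,     # request the confidence interval
                   conf.level = 0.95,   # set its coverage here
                   exact = FALSE)       # ties present: use the normal approximation

res$p.value
res$conf.int
```

With only a handful of mutated samples the test has little power, so a non-significant result should be read as inconclusive and not as evidence of no effect.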
Variant found in 2 patients of the BeatAML cohort (2 samples).
The splicing alterations being assessed are:
Variant information:
Load the extracted splice junctions of the gene harboring the mutation.
extractedSJ_path <- paste0(extractedSJ_dir_in,"TP53_UM_annotSJ.tsv")
GeneSJ <- read.delim(extractedSJ_path, sep ="\t")
Set the sample’s group: Mutated (MUT) or Non-Mutated (WT)
samples_df <- found_variants[found_variants$Gene=="TP53" & found_variants$MutationKey_Hg38 == "chr17,7674894,G,A",]
cases <- samples_df$RNA_Sample[samples_df$Validable == "Validable"]
GeneSJ$GROUP <- ifelse(GeneSJ$sample_id %in% cases , "MUT", "WT")
Search for the splice junctions of interest in the extracted splice junctions of the gene by position (chr_SJstart_SJend).
Search: chr17:7674291-7675052
Show all the splice junctions containing the position chr17:7674291-7675052
colnames(GeneSJ)[grep("7674291_7675052",colnames(GeneSJ))]
## [1] "chr17_7674291_7675052"
Found: chr17_7674291_7675052
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr17_7674291_7675052
## [1] 0 0 0 0 0 0 0 2 2 1 0 1 0 0 1 2 3 0 0 1 0 1 0 1 8
## [26] 0 0 0 4 1 1 3 2 1 0 1 1 2 1 0 0 4 1 3 0 1 1 1 1 0
## [51] 2 2 3 1 0 1 0 2 15 1 4 2 0 2 0 1 1 7 0 1 4 2 0 3 0
## [76] 0 1 2 1 1 3 0 1 3 1 0 0 0 3 0 4 0 2 2 4 1 0 0 2 1
## [101] 6 0 1 0 2 2 2 2 0 0 1 5 0 1 2 1 1 0 0 1 1 1 0 1 2
## [126] 0 2 1 0 4 2 2 0 2 0 0 3 0 1 1 0 0 1 1 0 2 1 4 1 1
## [151] 0 5 2 2 0 0 6 0 2 2 0 3 1 1 5 0 1 0 0 0 8 2 0 1 1
## [176] 1 1 0 0 0 2 0 5 3 1 0 4 1 1 0 0 5 0 1 2 3 2 1 1 0
## [201] 0 3 2 1 1 0 1 0 1 1 5 4 1 1 2 0 5 0 3 1 4 1 2 0 1
## [226] 1 1 1 4 0 0 1 2 3 4 1 1 0 2 0 1 2 5 1 1 3 4 0 2 1
## [251] 1 3 3 0 2 0 2 0 1 0 2 8 3 0 2 0 0 0 1 1 2 1 1 4 0
## [276] 0 1 1 0 0 2 2 4 2 1 1 5 0 0 1 2 13 0 0 0 1 0 0 0 0
## [301] 0 0 2 0 2 0 1 0 0 3 0 1 5 1 0 1 2 0 2 2 1 0 7 1 1
## [326] 2 0 0 0 0 5 1 0 0 3 0 0 1 2 0 1 1 1 0 0 2 2 1 0 2
## [351] 0 0 1 0 1 0 7 1 1 5 1 2 0 4 3 0 0 7 9 0 0 4 1 3 3
## [376] 2 2 1 0 0 0 1 1 2 0 0 1 1 0 0 1 0 1 0 0 0 1 0 2 0
## [401] 0 0 2 1 2 1 0 1 2 1 1 0 0 0 0 1 1 0 0 1 1 0 1 1 2
## [426] 5 15 1 1 0 0 2 0 0 1 0 0 1 0 2 0 0 0 1 0 0 0 0 1 1
## [451] 0 1 0 2 0 0 0
Samples with the SJ of interest:
table(GeneSJ$chr17_7674291_7675052>0)
##
## FALSE TRUE
## 179 278
Groups of the samples having the alternative splice junction:
table(GeneSJ$GROUP[GeneSJ$chr17_7674291_7675052 > 0])
##
## MUT WT
## 1 277
Alternative SJ found in the mutated samples.
Exon upstream (UE): chr17:7674972-7675052
Exon downstream (DE): chr17:7674291-7674858, splice site chr17:7674858
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr17_7674972_7675052
## [1] 141 24 108 9 10 90 187 146 72 53 165 160 39 11 90 106 169 255
## [19] 43 175 11 89 88 15 75 39 41 2 31 212 177 49 216 88 17 35
## [37] 122 27 47 37 88 32 219 139 67 93 115 7 57 46 42 13 39 113
## [55] 3 109 12 149 183 139 34 4 78 37 69 6 35 65 183 59 217 55
## [73] 20 148 94 125 25 295 31 160 151 91 8 185 33 116 16 161 61 50
## [91] 136 74 41 49 29 136 158 140 89 96 63 94 158 26 100 317 96 60
## [109] 6 10 111 72 42 60 58 207 193 127 57 167 12 203 163 152 17 21
## [127] 16 66 111 58 60 49 116 62 59 99 35 11 183 8 8 163 12 36
## [145] 281 14 37 32 233 117 96 168 50 213 80 81 196 55 14 5 98 45
## [163] 63 30 170 76 73 52 153 49 39 71 32 11 76 7 155 93 123 79
## [181] 144 71 132 51 21 31 51 17 123 10 44 27 34 77 208 19 119 95
## [199] 120 138 80 22 11 39 91 31 70 131 111 77 47 135 14 99 33 27
## [217] 33 160 105 23 24 41 148 48 153 35 75 36 123 90 71 80 41 74
## [235] 123 68 17 68 62 161 93 100 144 115 82 19 93 0 82 19 18 32
## [253] 134 127 46 146 32 150 173 126 133 32 42 156 84 15 110 189 11 92
## [271] 252 55 154 82 60 100 104 144 28 55 165 110 88 62 25 121 33 10
## [289] 18 20 117 48 65 81 66 225 18 189 31 221 71 12 13 56 99 113
## [307] 24 17 19 23 153 53 67 79 293 113 12 64 43 26 73 47 42 28
## [325] 105 163 102 64 64 16 259 54 46 59 87 1 98 110 57 130 59 9
## [343] 203 221 57 180 66 114 32 93 32 70 23 58 45 8 19 107 60 30
## [361] 98 7 15 12 70 150 129 121 19 240 332 10 62 8 8 62 81 18
## [379] 135 128 2 116 34 29 4 118 21 154 137 54 89 109 80 6 34 184
## [397] 71 126 213 114 322 13 45 97 33 165 88 12 70 98 293 24 59 10
## [415] 170 53 79 70 56 5 53 174 45 66 33 105 26 78 159 123 104 105
## [433] 0 78 53 49 185 39 176 21 17 15 160 132 163 79 30 27 366 96
## [451] 75 104 118 122 14 5 97
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr17_7674291_7674858
## [1] 401 85 346 5 45 318 647 433 220 197 388 393 82 35 270 310 242 692
## [19] 136 461 37 326 216 25 99 97 154 7 49 347 603 109 585 256 85 76
## [37] 289 77 125 102 371 76 407 356 178 279 297 16 152 148 95 49 108 287
## [55] 17 423 27 193 491 369 124 9 158 134 175 23 153 134 478 114 501 141
## [73] 66 428 285 344 77 803 78 456 540 234 28 535 90 472 47 374 112 117
## [91] 524 301 167 105 81 391 364 308 249 357 236 337 432 80 314 462 272 168
## [109] 15 16 306 257 118 223 128 517 731 243 152 360 49 358 371 463 2 78
## [127] 20 145 303 173 152 153 268 248 154 264 83 18 504 17 21 398 33 88
## [145] 556 43 105 79 628 403 297 388 132 464 226 303 618 175 40 28 276 80
## [163] 107 85 691 144 210 113 349 131 53 156 143 18 187 11 598 268 391 126
## [181] 444 177 473 132 64 72 218 65 334 27 141 62 85 195 593 33 299 253
## [199] 409 242 164 41 30 55 205 87 230 342 282 130 205 210 34 338 46 32
## [217] 85 469 262 70 45 101 379 81 427 57 183 80 316 307 151 225 110 211
## [235] 484 199 21 130 174 449 258 155 355 361 239 54 192 1 209 93 40 90
## [253] 359 389 148 254 102 230 507 309 344 99 108 499 338 21 330 454 30 244
## [271] 807 137 316 302 211 172 388 297 100 239 371 269 131 207 64 194 41 36
## [289] 66 45 359 96 137 138 148 349 31 496 72 512 100 21 69 162 376 368
## [307] 66 44 47 52 424 166 207 224 507 325 52 190 119 105 155 131 80 83
## [325] 301 656 231 185 320 57 419 118 116 130 132 1 236 281 195 233 198 25
## [343] 479 711 141 680 142 380 140 249 38 207 42 146 119 31 42 143 197 141
## [361] 341 26 65 34 140 306 334 395 73 380 633 29 155 28 19 162 201 36
## [379] 371 308 4 289 92 73 5 307 84 321 385 117 203 299 320 15 44 342
## [397] 170 360 510 296 475 25 109 259 89 702 198 47 263 263 492 34 133 24
## [415] 460 76 288 253 204 17 129 299 150 172 95 272 41 253 506 385 316 328
## [433] 2 232 184 80 495 97 335 33 28 24 374 427 254 155 88 47 544 386
## [451] 152 306 350 265 49 8 261
Count the reads of all the splice junctions of the gene harboring the variant:
GeneSJ$rowSum_SJtotal <- rowSums(GeneSJ[,grep("chr", names(GeneSJ))])
Normalization of the expression by the total read counts of all the splice junctions of the gene:
GeneSJ$Normalized_CanonDE <- (GeneSJ$chr17_7674291_7674858)/GeneSJ$rowSum_SJtotal*100
GeneSJ$Normalized_ES <- (GeneSJ$chr17_7674291_7675052)/GeneSJ$rowSum_SJtotal*100
Download the normalized values for the assessed splice junctions of all the AML samples:
VAF of the mutated samples:
Canonical splice junction Exon downstream (DE): chr17:7674291-7674858, donor splice site chr17:7674858
ggplot(GeneSJ, aes(sample_id,Normalized_CanonDE,color=GROUP)) +
geom_point(size=.8) +
labs(color='GROUP',
title = "Normalized Expression of TP53 chr17:7674291-7674858",
subtitle="Potential Donor Loss Effect on Canonical SJ",
y = "Normalized Expression")
Splicing alterations:
Canonical splice junction Exon downstream (DE): chr17:7674291-7674858, donor splice site chr17:7674858
Splicing alteration:
Violin Plots for the alternative splice junctions interrogated:
Donor Loss:
Exon Skipping:
SJCounts <- GeneSJ
Normality Test:
shapiro.test(SJCounts$Normalized_CanonDE[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_CanonDE[SJCounts$GROUP == "WT"]
## W = 0.90368, p-value = 2.288e-16
Mean normalized expression of the canonical SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_CanonDE[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 10.52434
Normalized expression of the canonical SJ in the MUT samples:
MUT_SJi <- SJCounts$Normalized_CanonDE[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 10.95890 11.65865
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_CanonDE - mean_WT_SJi
Difference in the MUT sample: deviation of the Normalized expression of the MUT patient from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] 0.4345613 1.1343110
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:454] = -10.14, -8.0853, -7.0494, ..., 6.1423, 7.2388
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.5670330 0.7758242
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_CanonDE")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
MUT_df$Pvalue <- MUT_df$ECDF
MUT_df$Prediction <- "Donor Loss"
MUT_df$splice_junction_status <- "CanonicalSJ"
MUT_df$splice_junction_position <- "chr17:7674291-7674858"
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
MUT_df$Pvalue <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$Pvalue)
Download the VAF, inferred percentiles, and p-values of the mutated samples:
Normality Test:
shapiro.test(SJCounts$Normalized_ES[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_ES[SJCounts$GROUP == "WT"]
## W = 0.5814, p-value < 2.2e-16
Value of Mean Normalized Expression of the Alternative SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_ES[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 0.1276335
Normalized Expression Value of the Alternative SJ in the MUT sample:
MUT_SJi <- SJCounts$Normalized_ES[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 0.0000000 0.1201923
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_ES - mean_WT_SJi
Difference in the MUT sample: deviation of the Normalized expression of the MUT patient from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] -0.12763350 -0.00744119
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:272] = -0.12763, -0.11317, -0.11177, ..., 1.3908, 1.7265
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.3912088 0.7230769
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_ES")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
MUT_df$Pvalue <- 1 - MUT_df$ECDF
MUT_df$Prediction <- "Exon Skipping"
MUT_df$splice_junction_status <- "AlternativeSJ found in MUT samples"
MUT_df$splice_junction_position <- "chr17:7674291-7675052"
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
MUT_df$Pvalue <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$Pvalue)
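For a predicted gain, `1 - ECDF` is the fraction of WT samples strictly above the mutated value. When many WT samples have zero reads, a mutated sample with zero reads still receives a value well below 1, which is why such samples are masked with NA above. The sketch below illustrates this on invented values. The inclusive variant `p_inclusive` is an alternative shown for comparison and is not used in the analysis.

```r
# Invented WT values with many zeros, as is typical for a rare junction.
wt  <- c(0, 0, 0, 0, 0.05, 0.10, 0.10, 0.20, 0.50, 1.00)
mut <- c(0, 0.10, 2.00)

wt_ecdf <- ecdf(wt)
p_strict    <- 1 - wt_ecdf(mut)                                    # fraction of WT >  x
p_inclusive <- vapply(mut, function(x) mean(wt >= x), numeric(1))  # fraction of WT >= x

# Mask samples with no supporting reads, mirroring the ifelse() step above.
p_strict[mut == 0]    <- NA
p_inclusive[mut == 0] <- NA

data.frame(mut, p_strict, p_inclusive)
```

The two definitions differ only when a mutated value ties with WT values, which is common for low-count junctions.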
Download the VAF, inferred percentiles, and p-values of the mutated samples:
Variant found in 1 patient of the BeatAML cohort (1 sample).
The splicing alterations being assessed are:
Variant information:
Load the extracted splice junctions of the gene harboring the mutation.
extractedSJ_path <- paste0(extractedSJ_dir_in,"ASXL1_UM_annotSJ.tsv")
GeneSJ <- read.delim(extractedSJ_path, sep ="\t")
Set the sample’s group: Mutated (MUT) or Non-Mutated (WT)
samples_df <- found_variants[found_variants$Gene=="ASXL1" & found_variants$MutationKey_Hg38 == "chr20,32433787,C,T",]
cases <- samples_df$RNA_Sample[samples_df$Validable == "Validable"]
GeneSJ$GROUP <- ifelse(GeneSJ$sample_id %in% cases , "MUT", "WT")
Search for the splice junctions of interest in the extracted splice junctions of the gene by position (chr_SJstart_SJend).
Search: predicted at 2bp from the variant, chr20:32433786
Show all the splice junctions containing the positions between 32433780-32433789
colnames(GeneSJ)[grep("3243378",colnames(GeneSJ))]
## character(0)
Alternative SJ not found in the splice junction collection.
VAF of the mutated samples:
Variant found in 2 patients of the BeatAML cohort (2 samples).
The splicing alterations being assessed are:
Variant information:
Load the extracted splice junctions of the gene harboring the mutation.
extractedSJ_path <- paste0(extractedSJ_dir_in,"EP300_UM_annotSJ.tsv")
GeneSJ <- read.delim(extractedSJ_path, sep ="\t")
Set the sample’s group: Mutated (MUT) or Non-Mutated (WT)
samples_df <- found_variants[found_variants$Gene=="EP300" & found_variants$MutationKey_Hg38 == "chr22,41160652,T,C",]
cases <- samples_df$RNA_Sample[samples_df$Validable == "Validable"]
GeneSJ$GROUP <- ifelse(GeneSJ$sample_id %in% cases , "MUT", "WT")
Search for the splice junctions of interest in the extracted splice junctions of the gene by position (chr_SJstart_SJend).
Search: predicted at the variant position, chr22:41160653
Show all the splice junctions containing the positions between 41160650-41160659
colnames(GeneSJ)[grep("4116065",colnames(GeneSJ))]
## character(0)
Alternative SJ not found in the splice junction collection.
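When the predicted junction is absent from the collection, the downstream normalization and ECDF steps cannot be run. A small guard, sketched below, makes that explicit. The helper `get_sj_counts` is hypothetical, and the junction coordinates and counts in the toy table are invented.

```r
# Invented junction table; the coordinates are placeholders, not real junctions.
toy <- data.frame(sample_id = c("S1", "S2"),
                  chr22_41150000_41160000 = c(12, 30),
                  stringsAsFactors = FALSE)

# Hypothetical helper: counts for a junction, or NULL if it was never observed.
get_sj_counts <- function(df, junction) {
  if (!junction %in% colnames(df)) {
    message("Alternative SJ not found in the splice junction collection: ", junction)
    return(NULL)
  }
  df[[junction]]
}

missing_sj <- get_sj_counts(toy, "chr22_41150000_41160653")  # absent: NULL with a message
present_sj <- get_sj_counts(toy, "chr22_41150000_41160000")  # present: numeric counts
```

Checking for `NULL` before normalizing avoids silently creating an empty column when `GeneSJ$<junction>` does not exist.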
VAF of the mutated samples:
Variant found in 1 patient of the BeatAML cohort (1 sample).
The splicing alterations being assessed are:
Variant information:
Load the extracted splice junctions of the gene harboring the mutation.
extractedSJ_path <- paste0(extractedSJ_dir_in,"DNMT3A_UM_annotSJ.tsv")
GeneSJ <- read.delim(extractedSJ_path, sep ="\t")
Set the sample’s group: Mutated (MUT) or Non-Mutated (WT)
samples_df <- found_variants[found_variants$Gene=="DNMT3A" & found_variants$MutationKey_Hg38 == "chr2,25239130,C,T",]
cases <- samples_df$RNA_Sample[samples_df$Validable == "Validable"]
GeneSJ$GROUP <- ifelse(GeneSJ$sample_id %in% cases , "MUT", "WT")
Search for the splice junctions of interest in the extracted splice junctions of the gene by position (chr_SJstart_SJend).
Search: chr2:25237006-25239139
Show all the splice junctions containing the position chr2:25237006-25239139
colnames(GeneSJ)[grep("25237006_25239139",colnames(GeneSJ))]
## [1] "chr2_25237006_25239139"
Found: chr2_25237006_25239139
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr2_25237006_25239139
## [1] 0 4 5 0 3 3 4 2 0 5 0 0 4 0 0 7 14 6 0 7 0 5 2 0 0
## [26] 3 3 0 6 4 0 2 0 3 0 0 6 2 2 0 0 7 12 0 0 6 0 0 0 0
## [51] 3 0 1 2 0 7 0 0 0 0 4 0 1 4 0 0 4 0 0 0 12 0 0 0 4
## [76] 0 0 23 0 7 0 1 2 9 2 0 3 3 0 0 0 4 0 11 6 0 2 0 0 5
## [101] 10 0 2 0 5 15 6 5 0 0 0 0 4 2 4 0 9 3 0 0 0 0 8 7 10
## [126] 0 3 0 0 0 0 0 0 4 0 0 0 0 0 0 0 0 0 0 13 2 0 4 6 3
## [151] 0 0 5 2 17 4 7 0 0 0 0 12 0 0 4 0 0 5 4 2 2 0 0 0 1
## [176] 0 3 3 2 0 2 0 7 0 2 2 0 0 3 0 0 0 3 0 4 3 4 5 5 5
## [201] 1 0 0 0 0 6 0 6 5 3 4 3 0 0 3 4 10 4 0 0 0 1 4 0 0
## [226] 5 3 0 0 0 7 8 0 0 7 0 0 7 2 4 0 0 6 7 5 8 0 0 0 0
## [251] 0 0 0 0 0 7 0 0 6 0 4 0 0 6 0 9 6 2 3 0 0 3 5 2 0
## [276] 5 0 0 3 4 0 0 5 4 3 2 0 2 0 0 0 0 7 0 0 0 3 14 0 5
## [301] 0 0 2 3 3 2 4 0 3 0 1 3 0 0 11 7 0 2 0 0 0 0 0 1 5
## [326] 0 0 2 0 2 5 0 4 2 4 7 8 4 0 0 0 0 1 11 0 0 3 0 7 2
## [351] 0 0 2 3 7 0 6 1 4 2 3 0 1 0 0 0 0 7 5 14 0 7 0 0 2
## [376] 0 2 0 5 2 0 0 0 4 0 8 0 3 0 0 7 3 0 0 0 9 0 5 0 0
## [401] 9 4 0 0 3 0 3 0 8 4 2 0 0 0 6 9 7 0 0 0 3 12 4 5 4
## [426] 6 8 6 0 4 4 3 0 5 0 2 0 0 7 6 2 0 5 0 6 0 1 0 6 0
## [451] 4 8 0 4 5 0 2
Samples with the SJ of interest:
table(GeneSJ$chr2_25237006_25239139>0)
##
## FALSE TRUE
## 226 231
Groups of the samples having the alternative splice junction:
table(GeneSJ$GROUP[GeneSJ$chr2_25237006_25239139 > 0])
##
## MUT WT
## 1 230
Alternative SJ found in the mutated samples.
Exon upstream (UE): chr2:25239216-25240301
Exon downstream (DE): chr2:25237006-25239129; donor splice site chr2:25239129
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr2_25239216_25240301
## [1] 106 34 89 9 25 79 79 115 36 48 77 56 62 15 62 194 133 135
## [19] 24 99 23 72 62 30 167 37 34 7 47 136 49 76 55 50 32 81
## [37] 72 68 49 15 25 91 216 108 26 114 22 21 74 9 65 34 24 78
## [55] 29 96 18 36 129 123 108 38 32 56 23 44 122 57 179 137 267 123
## [73] 104 76 86 111 34 420 43 151 101 16 87 147 77 145 35 82 38 72
## [91] 115 147 24 118 116 35 58 82 49 96 101 37 99 81 62 392 133 44
## [109] 22 14 68 47 78 84 60 143 128 138 35 120 116 26 206 170 140 21
## [127] 50 28 109 90 37 87 35 107 14 29 91 16 154 18 15 160 27 40
## [145] 179 50 35 80 97 105 28 66 107 32 186 90 171 53 33 36 47 117
## [163] 25 92 116 39 81 79 66 30 31 107 49 45 74 50 192 94 56 65
## [181] 141 87 225 17 15 32 31 72 114 8 95 16 20 107 147 55 83 104
## [199] 140 137 57 24 14 65 32 62 76 81 27 89 94 81 37 29 56 35
## [217] 137 81 63 54 28 37 65 46 270 137 33 15 121 68 76 101 33 58
## [235] 131 46 49 94 27 140 67 139 163 84 101 258 100 11 84 38 48 87
## [253] 77 93 98 118 56 84 158 93 109 96 86 128 42 111 112 98 35 81
## [271] 89 72 137 59 40 71 50 95 41 81 117 99 73 54 28 58 85 11
## [289] 14 57 82 149 156 28 44 89 86 373 13 185 131 16 32 52 207 71
## [307] 47 30 32 46 46 44 23 105 273 83 11 28 50 79 20 51 75 71
## [325] 84 144 22 39 55 13 116 65 29 65 104 109 130 126 52 46 48 30
## [343] 90 198 89 100 94 81 108 126 81 87 37 59 80 18 172 69 113 40
## [361] 125 47 18 8 19 119 112 159 70 339 207 106 70 57 52 44 58 46
## [379] 43 123 0 85 22 42 28 129 61 51 131 78 107 115 58 51 67 131
## [397] 60 61 68 33 143 44 37 88 67 310 132 66 109 124 67 41 29 34
## [415] 73 88 142 69 47 32 46 124 130 48 48 107 184 86 125 127 89 91
## [433] 0 118 38 44 122 39 231 56 51 26 132 51 217 44 17 35 123 94
## [451] 54 117 90 64 32 11 67
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr2_25237006_25239129
## [1] 138 65 159 12 49 137 167 172 50 87 105 70 119 28 124 304 178 254
## [19] 38 127 32 138 93 44 232 71 75 20 92 198 83 129 75 85 47 129
## [37] 118 97 61 25 40 160 208 178 44 199 20 47 111 18 102 46 46 139
## [55] 58 131 26 42 219 180 165 70 45 105 39 58 208 85 281 194 441 196
## [73] 168 145 157 128 65 761 75 255 195 36 191 251 136 272 53 106 50 91
## [91] 186 253 36 188 180 61 91 128 72 167 156 63 179 131 99 460 191 63
## [109] 51 31 95 81 110 144 98 245 232 184 75 188 195 37 352 287 225 51
## [127] 89 55 181 168 68 140 65 163 25 56 140 27 238 35 24 215 62 58
## [145] 292 97 65 131 157 165 44 95 138 46 321 159 259 74 61 64 64 142
## [163] 60 152 238 54 120 122 104 55 50 171 85 66 136 68 343 146 93 113
## [181] 247 158 379 36 34 52 72 132 197 19 136 29 45 134 223 80 128 174
## [199] 277 205 95 39 20 86 41 94 109 158 47 110 156 105 50 68 75 56
## [217] 192 140 96 93 39 55 99 64 424 184 56 26 195 110 99 214 46 96
## [235] 183 89 86 184 36 285 144 174 246 161 157 409 167 20 138 54 70 126
## [253] 121 130 140 160 78 76 241 180 162 151 126 207 92 212 188 160 76 124
## [271] 150 147 215 105 56 97 98 167 59 129 174 166 107 96 30 78 160 13
## [289] 26 103 130 153 272 62 65 107 97 546 21 198 144 32 35 65 297 122
## [307] 69 44 53 97 94 71 67 143 342 108 14 53 102 116 38 89 159 98
## [325] 122 238 30 84 99 29 173 118 75 96 134 152 217 180 87 87 69 55
## [343] 125 292 142 174 123 126 175 185 108 166 57 105 136 34 269 95 197 79
## [361] 201 101 41 22 31 199 157 212 129 437 284 199 102 107 70 68 136 67
## [379] 80 228 1 154 38 81 49 233 72 87 213 135 164 191 104 88 115 223
## [397] 65 90 119 60 161 76 81 136 138 532 211 119 151 226 77 73 47 70
## [415] 123 112 214 112 95 47 75 143 259 60 71 211 278 171 205 219 158 124
## [433] 0 181 77 65 214 52 370 75 69 42 178 87 255 75 21 37 159 162
## [451] 91 198 176 111 53 11 116
Count the reads of all the splice junctions of the gene harboring the variant:
GeneSJ$rowSum_SJtotal <- rowSums(GeneSJ[,grep("chr", names(GeneSJ))])
Normalization of the expression by the total read counts of all the splice junctions of the gene:
GeneSJ$Normalized_CanonDE <- (GeneSJ$chr2_25237006_25239129)/GeneSJ$rowSum_SJtotal*100
GeneSJ$Normalized_DG <- (GeneSJ$chr2_25237006_25239139)/GeneSJ$rowSum_SJtotal*100
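The normalization above expresses each junction as a percentage of all junction reads of the gene in that sample. A minimal sketch on toy data follows; the table `toy_sj`, its values, and the helper `normalize_sj` are hypothetical and not part of the report. It also shows one possible way to handle samples with no junction reads at all, where the ratio would otherwise be `NaN`.

```r
# Toy junction-count table (hypothetical values, not BeatAML data).
toy_sj <- data.frame(
  sample_id = c("S1", "S2", "S3"),
  chr2_100_200 = c(90, 45, 0),
  chr2_100_210 = c(10, 5, 0)
)

# Select junction columns with an anchored pattern so that derived columns
# can never be included in the per-sample total.
sj_cols <- grep("^chr", names(toy_sj), value = TRUE)
toy_sj$rowSum_SJtotal <- rowSums(toy_sj[, sj_cols, drop = FALSE])

# Percentage of the gene's junction reads that support one junction.
# A sample with no junction reads gives 0/0 = NaN; map it to NA explicitly.
normalize_sj <- function(counts, total) {
  pct <- counts / total * 100
  pct[is.nan(pct)] <- NA_real_
  pct
}

toy_sj$Normalized_alt <- normalize_sj(toy_sj$chr2_100_210, toy_sj$rowSum_SJtotal)
toy_sj
```

Anchoring the pattern with `^chr` is a design choice of this sketch: it keeps the total stable if the chunk is re-run after other columns have been added to the table.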
Download the normalized values for the assessed splice junctions of all the AML samples:
Mutated samples vaf:
Canonical splice junction:
Splicing alterations:
Canonical splice junction:
Splicing alteration:
Violin Plots for the alternative splice junctions interrogated:
Donor Loss:
Exon Skipping:
SJCounts <- GeneSJ
Normality Test:
shapiro.test(SJCounts$Normalized_CanonDE[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_CanonDE[SJCounts$GROUP == "WT"]
## W = 0.92632, p-value = 3.542e-14
Mean normalized expression of the canonical SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_CanonDE[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 6.303546
Normalized expression of the canonical SJ in the MUT sample:
MUT_SJi <- SJCounts$Normalized_CanonDE[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 6.47526
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_CanonDE - mean_WT_SJi
Difference in the MUT sample: deviation of the Normalized expression of the MUT patient from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] 0.1717137
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:453] = -6.3035, -3.3822, -2.3648, ..., 4.8076, 7.5853
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.6052632
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_CanonDE")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
MUT_df$Pvalue <- MUT_df$ECDF
MUT_df$Prediction <- "Donor Loss"
MUT_df$splice_junction_status <- "CanonicalSJ"
MUT_df$splice_junction_position <- "chr2:25237006-25239129"
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
MUT_df$Pvalue <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$Pvalue)
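The percentile logic used above can be illustrated on toy data. This is a sketch under stated assumptions: the values in `wt_values` and `mut_value` are hypothetical, and the names `p_loss` and `p_gain` are introduced only for illustration. The empirical CDF of the WT group gives the fraction of WT samples at or below the MUT sample; the lower tail is read as the p-value for a junction loss and the upper tail for a junction gain.

```r
# Hypothetical normalized expression values (not BeatAML data).
wt_values <- c(0, 0.05, 0.10, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70)
mut_value <- 0.65

# Centre on the WT mean, as in the report; the shift does not change ranks.
wt_mean <- mean(wt_values)
wt_ecdf <- ecdf(wt_values - wt_mean)

# Fraction of WT samples at or below the MUT sample.
percentile <- wt_ecdf(mut_value - wt_mean)

# Loss of a canonical junction: low values are extreme, so use the lower tail.
p_loss <- percentile
# Gain of an alternative junction: high values are extreme, so use the upper tail.
p_gain <- 1 - percentile

c(percentile = percentile, p_loss = p_loss, p_gain = p_gain)
```

Because subtracting the WT mean shifts every value by the same amount, `ecdf(wt_values)(mut_value)` returns the same percentile; the centring only changes the scale on which the deviation is reported.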
Download the vaf, inferred percentiles and pvalues of the mutated samples:
Normality Test:
shapiro.test(SJCounts$Normalized_DG[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_DG[SJCounts$GROUP == "WT"]
## W = 0.81071, p-value < 2.2e-16
Value of Mean Normalized Expression of the Alternative SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_DG[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 0.1089652
Normalized Expression Value of the Alternative SJ in the MUT sample:
MUT_SJi <- SJCounts$Normalized_DG[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 0.1832621
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_DG - mean_WT_SJi
Difference in the MUT sample: deviation of the Normalized expression of the MUT patient from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] 0.07429686
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:225] = -0.10897, -0.056963, -0.055346, ..., 0.37789, 0.39103
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.7171053
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_DG")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
MUT_df$Pvalue <- 1 - MUT_df$ECDF
MUT_df$Prediction <- "Donor Gain"
MUT_df$splice_junction_status <- "AlternativeSJ found in MUT samples"
MUT_df$splice_junction_position <- "chr2:25237006-25239139"
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
MUT_df$Pvalue <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$Pvalue)
Download the vaf, inferred percentiles and pvalues of the mutated samples:
Variant found in 1 patient of the BeatAML (1 sample)
The splicing alterations being assessed are:
Variant information:
Load the extracted splice junctions of the gene harboring the mutation.
extractedSJ_path <- paste0(extractedSJ_dir_in,"DNMT3A_UM_annotSJ.tsv")
GeneSJ <- read.delim(extractedSJ_path, sep ="\t")
Set each sample’s group: mutated (MUT) or non-mutated (WT)
samples_df <- found_variants[found_variants$Gene=="DNMT3A" & found_variants$MutationKey_Hg38 == "chr2,25235792,T,C",]
cases <- samples_df$RNA_Sample[samples_df$Validable == "Validable"]
GeneSJ$GROUP <- ifelse(GeneSJ$sample_id %in% cases , "MUT", "WT")
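The group assignment can be sketched on toy data as follows. The sample identifiers are hypothetical, and the `missing_cases` check is an addition of this sketch, not a step of the report: it lists mutated samples that have no row in the junction table and would therefore be silently absent from the MUT group.

```r
# Toy junction table and list of validable mutated samples (hypothetical IDs).
toy_sj <- data.frame(sample_id = c("A1", "A2", "A3", "A4"))
cases <- c("A2", "A9")  # "A9" has no junction data in this toy table

# Label every sample as mutated (MUT) or non-mutated (WT).
toy_sj$GROUP <- ifelse(toy_sj$sample_id %in% cases, "MUT", "WT")

# Mutated samples that are absent from the junction table.
missing_cases <- setdiff(cases, toy_sj$sample_id)

# Group sizes actually available for the comparison.
group_sizes <- table(toy_sj$GROUP)
group_sizes
```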
Search for the splice junctions of interest in the extracted splice junctions of the gene by position (chr_SJstart_SJend).
Search: predicted at 5bp from the variant, chr2:25235796
Show all the splice junctions containing the positions between 25235790-25235799
colnames(GeneSJ)[grep("2523579",colnames(GeneSJ))]
## character(0)
Alternative SJ not found in the splice junction collection.
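The prefix search above only covers a window that happens to share its leading digits. As an alternative sketch, the junction column names (chr_SJstart_SJend) can be parsed and compared numerically against an arbitrary window. The helper `find_sj_in_window` is hypothetical and not part of the report; the column names in `toy_names` are taken from this section.

```r
# Hypothetical helper: return junction columns whose start or end coordinate
# falls within [from, to] on the given chromosome.
find_sj_in_window <- function(sj_names, chrom, from, to) {
  parts <- strsplit(sj_names, "_", fixed = TRUE)
  keep <- vapply(parts, function(p) {
    # Skip columns that are not of the form chr_SJstart_SJend.
    if (length(p) != 3 || p[1] != chrom) return(FALSE)
    pos <- suppressWarnings(as.numeric(p[2:3]))
    !anyNA(pos) && any(pos >= from & pos <= to)
  }, logical(1))
  sj_names[keep]
}

toy_names <- c("sample_id", "chr2_25235821_25236935", "chr2_25235826_25236935",
               "chr2_25234421_25235706", "GROUP")

# Same window as the prefix search: no junction starts or ends there.
find_sj_in_window(toy_names, "chr2", 25235790, 25235799)

# A window that crosses a change in the leading digits.
find_sj_in_window(toy_names, "chr2", 25235816, 25235826)
```

A numeric window avoids having to choose a digit prefix by hand, at the cost of parsing every column name.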
Search: chr2:25235821-25236935
Show all the splice junctions containing the position chr2:25235821-25236935
colnames(GeneSJ)[grep("25235821_25236935",colnames(GeneSJ))]
## [1] "chr2_25235821_25236935"
Found: chr2_25235821_25236935
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr2_25235821_25236935
## [1] 6 0 8 0 0 2 10 2 4 5 4 3 9 0 6 16 15 13 3 1 1 4 0 4 10
## [26] 8 0 0 10 10 5 8 5 5 4 8 5 5 6 3 0 10 11 10 2 9 0 2 6 1
## [51] 7 5 0 0 3 10 0 3 12 5 6 4 2 7 0 0 14 6 12 4 11 0 5 0 5
## [76] 6 1 32 6 23 14 0 6 10 3 19 5 0 5 4 5 18 0 8 14 0 0 6 2 5
## [101] 16 3 8 5 2 29 8 3 2 0 5 5 9 9 0 5 7 4 3 4 7 0 12 13 5
## [126] 4 3 0 11 8 1 8 6 10 0 0 8 0 12 1 0 7 4 3 12 9 4 5 5 3
## [151] 0 4 7 2 22 7 17 7 5 0 1 8 4 8 27 0 3 8 3 3 3 8 4 9 10
## [176] 7 8 7 0 4 10 2 18 0 0 0 0 5 7 0 4 4 0 8 8 4 4 11 6 14
## [201] 8 3 0 4 3 0 4 6 2 5 9 6 0 0 4 0 7 8 2 4 0 0 2 1 22
## [226] 15 0 0 5 6 5 9 4 9 10 8 6 12 3 9 11 9 7 11 9 18 0 1 0 3
## [251] 7 8 10 6 6 5 4 4 11 7 4 4 4 15 6 7 12 7 9 6 3 7 6 0 2
## [276] 5 0 8 4 7 11 11 7 5 3 8 13 0 0 0 7 7 17 3 0 5 0 25 2 7
## [301] 8 1 0 0 9 6 0 7 2 2 0 0 2 9 13 6 1 2 8 10 1 1 7 2 6
## [326] 14 0 0 5 0 10 1 4 2 8 4 0 7 0 1 5 5 5 18 10 11 8 7 11 9
## [351] 10 13 2 0 8 0 17 8 14 0 4 0 3 1 3 12 3 7 8 25 11 14 6 0 2
## [376] 4 6 5 4 9 0 5 0 7 4 13 9 6 14 4 12 9 4 4 7 5 1 6 2 0
## [401] 0 8 2 7 5 15 7 8 10 11 3 5 5 0 4 5 13 3 0 1 9 7 16 6 2
## [426] 5 17 12 14 14 5 5 0 7 8 7 12 0 13 2 2 4 7 4 21 2 0 0 6 6
## [451] 6 4 4 3 3 1 5
Samples with the SJ of interest:
table(GeneSJ$chr2_25235821_25236935>0)
##
## FALSE TRUE
## 82 375
Groups of the samples having the alternative splice junction:
table(GeneSJ$GROUP[GeneSJ$chr2_25235821_25236935 > 0])
##
## MUT WT
## 1 374
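The raw counts per group are hard to compare when the groups differ greatly in size. A minimal sketch, on hypothetical toy data, of reporting the fraction of samples in each group that carry the junction alongside the counts:

```r
# Toy data (hypothetical): group label and read count of one junction.
toy <- data.frame(
  GROUP = c("MUT", "WT", "WT", "WT", "WT"),
  alt_sj = c(3, 0, 2, 5, 0)
)

# TRUE when the sample has at least one read supporting the junction.
has_sj <- toy$alt_sj > 0

# Cross-tabulation of group against junction presence.
counts <- table(GROUP = toy$GROUP, has_sj = has_sj)

# Fraction of samples carrying the junction within each group.
fraction_with_sj <- tapply(has_sj, toy$GROUP, mean)

counts
fraction_with_sj
```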
Alternative SJ found in the mutated samples.
Exon upstream (UE): chr2:25235826-25236935; acceptor splice site chr2:25238256
Exon downstream (DE): chr2:25234421-25235706; donor splice site chr2:25235706
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr2_25235826_25236935
## [1] 178 79 195 15 71 166 206 217 57 110 121 84 159 38 189 361 239 277
## [19] 56 145 39 189 137 50 270 106 83 29 102 249 90 146 99 116 60 162
## [37] 159 123 74 32 56 201 254 201 56 273 25 65 144 29 121 48 63 221
## [55] 84 166 30 49 296 216 207 88 51 134 47 76 270 112 328 253 510 241
## [73] 246 170 195 131 93 874 90 314 224 43 212 291 172 354 80 111 58 139
## [91] 265 321 51 220 227 66 86 165 93 214 214 81 192 148 133 567 213 84
## [109] 54 28 131 88 152 196 113 288 309 218 89 192 191 39 424 338 249 52
## [127] 95 59 204 215 89 175 87 209 34 75 159 38 289 33 31 267 77 67
## [145] 353 101 65 150 166 248 68 111 175 62 445 203 323 98 69 91 83 173
## [163] 69 165 316 71 124 148 115 70 43 224 136 75 163 88 473 154 122 116
## [181] 284 173 469 52 34 54 97 147 211 29 173 37 63 149 270 103 171 207
## [199] 305 285 130 34 28 109 45 110 141 190 60 115 199 132 63 85 83 70
## [217] 205 172 118 108 57 72 143 96 546 224 69 25 208 137 125 261 59 111
## [235] 235 100 87 223 59 336 168 230 318 207 184 457 198 23 153 69 87 160
## [253] 139 158 160 174 104 112 338 212 199 190 167 233 130 287 231 166 91 141
## [271] 183 157 265 130 77 124 122 201 77 198 219 185 126 116 46 87 186 25
## [289] 32 105 195 143 308 60 75 163 61 684 26 183 165 43 43 66 400 134
## [307] 92 57 74 113 99 91 88 174 387 169 24 58 117 150 30 86 185 110
## [325] 144 283 14 90 148 40 204 125 86 106 161 161 253 187 117 96 97 55
## [343] 86 366 169 183 147 170 245 233 126 193 59 106 166 34 319 112 211 96
## [361] 245 111 40 21 42 249 199 251 170 545 358 238 119 95 74 86 140 79
## [379] 93 237 1 189 45 98 72 263 95 100 258 148 222 243 138 99 129 291
## [397] 71 107 118 72 228 84 99 154 169 627 264 145 228 311 87 94 53 78
## [415] 152 142 272 154 111 72 95 197 279 93 102 246 315 213 242 229 219 135
## [433] 1 233 89 76 245 55 449 103 86 37 187 86 320 96 24 30 218 209
## [451] 106 230 227 127 82 14 141
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr2_25234421_25235706
## [1] 145 51 152 16 72 152 183 209 47 108 81 74 109 37 176 297 173 221
## [19] 31 92 39 140 94 37 173 56 73 31 79 166 90 110 67 74 66 100
## [37] 106 81 51 10 61 168 171 121 35 200 26 31 141 22 84 36 31 191
## [55] 56 158 18 45 242 146 163 73 36 144 41 62 199 119 212 161 348 161
## [73] 115 148 144 162 55 734 82 248 235 38 136 244 140 305 85 96 46 131
## [91] 203 331 45 203 216 34 62 115 52 203 170 74 138 113 110 443 156 36
## [109] 58 15 102 78 127 188 55 246 280 161 71 93 138 25 298 285 230 39
## [127] 67 57 174 117 57 164 46 185 23 63 133 31 227 7 20 180 60 44
## [145] 239 93 55 153 163 154 44 68 133 60 388 135 253 70 50 75 53 155
## [163] 51 147 315 44 93 91 79 47 31 164 106 50 122 71 432 98 75 76
## [181] 220 130 488 35 22 49 86 121 146 10 177 39 42 135 207 82 121 153
## [199] 252 187 73 36 22 85 35 64 122 165 61 72 182 98 40 88 53 57
## [217] 172 137 94 78 54 36 111 60 430 164 67 17 128 114 116 188 44 85
## [235] 212 112 69 175 51 243 134 182 222 178 179 387 135 13 112 66 58 116
## [253] 102 136 127 118 84 74 258 130 129 125 103 181 120 182 203 118 71 90
## [271] 116 132 141 157 57 73 79 140 71 193 190 112 88 78 33 66 145 24
## [289] 33 54 137 172 228 42 40 100 71 459 28 207 133 16 42 76 290 103
## [307] 51 44 45 71 70 66 58 126 276 89 13 47 80 137 29 56 141 77
## [325] 164 242 23 105 107 35 160 80 58 79 112 119 208 162 76 56 75 50
## [343] 157 274 133 195 109 147 211 149 68 166 61 91 107 34 212 95 187 92
## [361] 235 80 52 18 24 144 132 228 124 409 237 210 97 52 62 55 105 56
## [379] 89 159 0 142 35 72 50 235 100 66 167 110 140 198 160 80 83 167
## [397] 54 71 63 46 160 112 63 106 106 644 173 136 206 275 56 57 27 47
## [415] 88 83 284 132 94 91 106 132 178 51 69 166 243 197 180 200 233 100
## [433] 2 238 96 64 149 51 299 54 43 39 146 84 235 59 17 28 173 200
## [451] 80 144 184 82 58 12 131
Count the reads of all the splice junctions of the gene harboring the variant:
GeneSJ$rowSum_SJtotal <- rowSums(GeneSJ[,grep("chr", names(GeneSJ))])
Normalization of the expression by the total read counts of all the splice junctions of the gene:
GeneSJ$Normalized_CanonUE <- (GeneSJ$chr2_25235826_25236935)/GeneSJ$rowSum_SJtotal*100
GeneSJ$Normalized_CanonDE <- (GeneSJ$chr2_25234421_25235706)/GeneSJ$rowSum_SJtotal*100
GeneSJ$Normalized_AG <- (GeneSJ$chr2_25235821_25236935)/GeneSJ$rowSum_SJtotal*100
Download the normalized values for the assessed splice junctions of all the AML samples:
Mutated samples vaf:
Canonical splice junction:
Splicing alterations:
Canonical splice junction:
Splicing alteration:
Violin Plots for the alternative splice junctions interrogated:
Acceptor Loss:
Acceptor Gain:
SJCounts <- GeneSJ
Normality Test:
shapiro.test(SJCounts$Normalized_AG[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_AG[SJCounts$GROUP == "WT"]
## W = 0.95965, p-value = 7.47e-10
Value of Mean Normalized Expression of the Alternative SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_AG[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 0.2731979
Normalized Expression Value of the Alternative SJ in the MUT sample:
MUT_SJi <- SJCounts$Normalized_AG[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 0.6269592
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_AG - mean_WT_SJi
Difference in the MUT sample: deviation of the Normalized expression of the MUT patient from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] 0.3537613
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:369] = -0.2732, -0.21764, -0.21332, ..., 0.55227, 0.61613
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.9671053
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_AG")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
MUT_df$Pvalue <- 1 - MUT_df$ECDF
MUT_df$Prediction <- "Aceptor Gain"
MUT_df$splice_junction_status <- "AlternativeSJ found in MUT samples"
MUT_df$splice_junction_position <- "chr2:25235821-25236935"
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
MUT_df$Pvalue <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$Pvalue)
Download the vaf, inferred percentiles and pvalues of the mutated samples:
Normality Test:
shapiro.test(SJCounts$Normalized_CanonUE[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_CanonUE[SJCounts$GROUP == "WT"]
## W = 0.9343, p-value = 2.758e-13
Mean normalized expression of the canonical SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_CanonUE[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 7.623696
Normalized expression of the canonical SJ in the MUT sample:
MUT_SJi <- SJCounts$Normalized_CanonUE[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 9.090909
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_CanonUE - mean_WT_SJi
Difference in the MUT sample: deviation of the Normalized expression of the MUT patient from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] 1.467213
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:451] = -4.2983, -3.0101, -2.8542, ..., 4.8763, 8.3485
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.9407895
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_CanonUE")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
MUT_df$Pvalue <- MUT_df$ECDF
MUT_df$Prediction <- "Aceptor Loss"
MUT_df$splice_junction_status <- "CanonicalSJ"
MUT_df$splice_junction_position <- "chr2:25235826-25236935"
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
MUT_df$Pvalue <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$Pvalue)
Download the vaf, inferred percentiles and pvalues of the mutated samples:
Normality Test:
shapiro.test(SJCounts$Normalized_CanonDE[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_CanonDE[SJCounts$GROUP == "WT"]
## W = 0.91514, p-value = 2.545e-15
Mean normalized expression of the canonical SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_CanonDE[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 5.868508
Normalized expression of the canonical SJ in the MUT sample:
MUT_SJi <- SJCounts$Normalized_CanonDE[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 4.806688
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_CanonDE - mean_WT_SJi
Difference in the MUT sample: deviation of the Normalized expression of the MUT patient from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] -1.06182
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:449] = -5.8685, -4.0264, -3.4704, ..., 6.2884, 8.4172
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.1337719
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_CanonDE")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
MUT_df$Pvalue <- MUT_df$ECDF
MUT_df$Prediction <- "Donor Loss"
MUT_df$splice_junction_status <- "CanonicalSJ"
MUT_df$splice_junction_position <- "chr2:25234421-25235706"
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
MUT_df$Pvalue <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$Pvalue)
Download the vaf, inferred percentiles and pvalues of the mutated samples:
Variant found in 1 patient of the BeatAML (1 sample)
The splicing alterations being assessed are:
Variant information:
Load the extracted splice junctions of the gene harboring the mutation.
extractedSJ_path <- paste0(extractedSJ_dir_in,"DNMT3A_UM_annotSJ.tsv")
GeneSJ <- read.delim(extractedSJ_path, sep ="\t")
Set each sample’s group: mutated (MUT) or non-mutated (WT)
samples_df <- found_variants[found_variants$Gene=="DNMT3A" & found_variants$MutationKey_Hg38 == "chr2,25244214,G,A",]
cases <- samples_df$RNA_Sample[samples_df$Validable == "Validable"]
GeneSJ$GROUP <- ifelse(GeneSJ$sample_id %in% cases , "MUT", "WT")
Search for the splice junctions of interest in the extracted splice junctions of the gene by position (chr_SJstart_SJend).
Search: predicted at 2bp from the variant, chr2:25244215
Show all the splice junctions containing the positions between 25244210-25244219
colnames(GeneSJ)[grep("2524421",colnames(GeneSJ))]
## character(0)
Alternative SJ not found in the splice junction collection.
Mutated samples vaf:
Variant found in 1 patient of the BeatAML (1 sample)
The splicing alterations being assessed are:
Variant information:
Load the extracted splice junctions of the gene harboring the mutation.
extractedSJ_path <- paste0(extractedSJ_dir_in,"DNMT3A_UM_annotSJ.tsv")
GeneSJ <- read.delim(extractedSJ_path, sep ="\t")
Set each sample’s group: mutated (MUT) or non-mutated (WT)
samples_df <- found_variants[found_variants$Gene=="DNMT3A" & found_variants$MutationKey_Hg38 == "chr2,25240693,C,A",]
cases <- samples_df$RNA_Sample[samples_df$Validable == "Validable"]
GeneSJ$GROUP <- ifelse(GeneSJ$sample_id %in% cases , "MUT", "WT")
Search for the splice junctions of interest in the extracted splice junctions of the gene by position (chr_SJstart_SJend).
Search: predicted at 2bp from the variant, chr2:25240694
Show all the splice junctions containing the positions between 25240690-25240699
colnames(GeneSJ)[grep("2524069",colnames(GeneSJ))]
## character(0)
Alternative SJ not found in the splice junction collection.
Mutated samples vaf:
Variant found in 1 patient of the BeatAML (1 sample)
The splicing alterations being assessed are:
Variant information:
Load the extracted splice junctions of the gene harboring the mutation.
extractedSJ_path <- paste0(extractedSJ_dir_in,"KMT2D_UM_annotSJ.tsv")
GeneSJ <- read.delim(extractedSJ_path, sep ="\t")
Set each sample’s group: mutated (MUT) or non-mutated (WT)
samples_df <- found_variants[found_variants$Gene=="KMT2D" & found_variants$MutationKey_Hg38 == "chr12,49050056,G,A",]
cases <- samples_df$RNA_Sample[samples_df$Validable == "Validable"]
GeneSJ$GROUP <- ifelse(GeneSJ$sample_id %in% cases , "MUT", "WT")
Search for the splice junctions of interest in the extracted splice junctions of the gene by position (chr_SJstart_SJend).
Search: predicted at 2bp from the variant, chr12:49050057
Show all the splice junctions containing the position 49050056 (the pattern used matches this single coordinate, not the whole 49050050-49050059 window)
colnames(GeneSJ)[grep("49050056",colnames(GeneSJ))]
## character(0)
Alternative SJ not found in the splice junction collection.
Mutated samples vaf:
Variant found in 3 patients of the BeatAML (3 samples)
The splicing alterations being assessed are:
Variant information:
Load the extracted splice junctions of the gene harboring the mutation.
extractedSJ_path <- paste0(extractedSJ_dir_in,"KMT2D_UM_annotSJ.tsv")
GeneSJ <- read.delim(extractedSJ_path, sep ="\t")
Set each sample’s group: mutated (MUT) or non-mutated (WT)
samples_df <- found_variants[found_variants$Gene=="KMT2D" & found_variants$MutationKey_Hg38 == "chr12,49031255,G,A",]
cases <- samples_df$RNA_Sample[samples_df$Validable == "Validable"]
GeneSJ$GROUP <- ifelse(GeneSJ$sample_id %in% cases , "MUT", "WT")
Search for the splice junctions of interest in the extracted splice junctions of the gene by position (chr_SJstart_SJend).
Search: predicted at 2bp from the variant, chr12:49031256
Show all the splice junctions containing the positions between 49031250-49031259
colnames(GeneSJ)[grep("4903125",colnames(GeneSJ))]
## character(0)
Alternative SJ not found in the splice junction collection.
Search: chr12:49031313
colnames(GeneSJ)[grep("4903131",colnames(GeneSJ))]
## [1] "chr12_49031313_49031501" "chr12_49031313_49031996"
## [3] "chr12_49031313_49032512" "chr12_49031313_49032561"
## [5] "chr12_49031313_49033723" "chr12_49031313_49038762"
## [7] "chr12_49031313_49041300"
Show all the splice junctions containing the position chr12:49031313-49034067
colnames(GeneSJ)[grep("49031313_49034",colnames(GeneSJ))]
## character(0)
t(GeneSJ[GeneSJ$GROUP == "MUT",c("sample_id",colnames(GeneSJ)[grep("49031313",colnames(GeneSJ))])])
## 63 127 434
## sample_id "BA2134R" "BA2302R" "BA3009R"
## chr12_49031313_49031501 "0" "0" "0"
## chr12_49031313_49031996 "0" "0" "0"
## chr12_49031313_49032512 "2" "0" "2"
## chr12_49031313_49032561 "0" "0" "0"
## chr12_49031313_49033723 "0" "0" "0"
## chr12_49031313_49038762 "0" "0" "0"
## chr12_49031313_49041300 "0" "0" "0"
Show all the splice junctions containing the position chr12:49031313-49032512
colnames(GeneSJ)[grep("49031313_49032512",colnames(GeneSJ))]
## [1] "chr12_49031313_49032512"
Found: chr12_49031313_49032512
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr12_49031313_49032512
## [1] 3 1 8 0 0 10 8 11 10 9 2 9 0 1 3 3 15 4 1 7 2 9 0 0 2
## [26] 0 4 0 0 6 22 1 6 5 1 1 2 0 0 3 13 9 10 5 0 10 2 1 3 2
## [51] 4 8 4 2 0 10 1 0 8 5 1 1 2 4 1 0 3 0 9 2 10 2 3 6 1
## [76] 5 2 3 1 2 7 7 2 2 1 7 4 1 0 5 10 6 1 0 4 4 2 3 5 5
## [101] 8 14 4 4 6 3 3 6 2 0 8 1 2 38 0 8 17 10 5 3 2 6 2 2 2
## [126] 5 0 5 2 6 1 3 2 4 1 4 3 0 2 0 0 8 0 0 4 8 0 0 2 6
## [151] 3 18 4 3 11 4 6 2 3 3 5 6 3 5 11 1 3 2 0 2 1 6 2 0 6
## [176] 3 2 10 2 0 8 1 0 3 4 1 4 0 5 0 8 3 2 2 16 0 4 3 19 8
## [201] 9 0 1 0 3 0 14 4 2 1 5 3 1 9 1 2 2 9 13 3 1 3 13 1 10
## [226] 5 1 5 0 2 1 6 0 9 9 2 1 0 1 7 3 0 12 10 4 1 4 0 4 4
## [251] 1 7 7 6 4 0 1 1 0 1 5 5 1 11 10 0 5 12 0 6 3 0 3 4 1
## [276] 2 5 1 4 8 7 4 3 4 0 1 1 0 6 1 3 0 2 0 0 6 0 5 2 18
## [301] 9 1 4 6 21 8 1 1 0 3 4 2 2 1 0 9 0 4 5 1 0 6 5 0 15
## [326] 8 0 6 4 1 3 1 2 3 2 0 8 8 0 7 4 0 13 9 9 9 4 5 1 6
## [351] 0 1 0 1 0 1 1 1 5 4 11 1 2 0 1 7 3 2 0 9 2 3 2 0 1
## [376] 5 1 0 7 6 0 1 1 7 7 0 1 7 3 5 2 14 8 0 0 2 10 9 9 1
## [401] 7 0 1 5 2 14 0 1 8 2 13 3 2 1 4 2 6 2 8 21 8 4 3 0 4
## [426] 8 1 6 5 27 6 4 0 2 1 2 3 0 4 3 1 4 10 4 4 1 1 2 4 9
## [451] 1 5 8 4 4 0 6
Samples with the SJ of interest:
table(GeneSJ$chr12_49031313_49032512>0)
##
## FALSE TRUE
## 79 378
Groups of the samples having the alternative splice junction:
table(GeneSJ$GROUP[GeneSJ$chr12_49031313_49032512 > 0])
##
## MUT WT
## 2 376
Alternative SJ found in the mutated samples.
Exon upstream (UE): chr12:49033965-49034066; acceptor splice site chr12:49033965
Exon downstream (DE): chr12:49031034-49031174; donor splice site chr12:49031174
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr12_49033965_49034066
## [1] 251 127 541 10 219 303 337 292 165 499 140 151 173 225 129 176 248 213
## [19] 104 148 242 414 126 269 246 82 206 1 257 145 352 180 183 154 271 173
## [37] 142 232 117 262 303 166 167 144 22 315 21 194 251 98 134 274 165 352
## [55] 345 240 218 111 182 205 364 313 73 122 105 125 127 82 218 244 301 95
## [73] 219 210 209 260 409 256 145 144 244 140 369 140 195 221 110 63 178 171
## [91] 383 353 341 330 311 126 121 193 180 288 386 300 149 140 115 136 201 100
## [109] 146 158 222 162 227 398 62 223 397 242 205 114 350 139 180 206 331 317
## [127] 84 99 171 275 131 249 104 212 82 169 161 145 102 75 102 208 412 169
## [145] 155 392 72 259 158 340 128 507 358 146 390 77 168 231 97 144 169 216
## [163] 185 290 267 85 368 163 162 108 84 259 91 92 321 160 289 235 152 123
## [181] 315 126 318 149 281 83 336 98 147 395 275 351 273 140 259 160 283 188
## [199] 357 194 196 107 533 212 100 101 310 263 71 112 278 91 123 375 183 141
## [217] 70 254 158 59 186 154 272 195 308 134 100 82 83 354 161 407 195 103
## [235] 184 173 94 280 62 247 341 89 336 290 268 311 91 0 134 102 178 152
## [253] 184 243 90 129 317 213 253 90 163 290 128 227 317 164 366 167 115 190
## [271] 298 118 131 189 163 104 134 172 200 308 92 196 110 242 50 215 86 233
## [289] 252 109 284 252 164 38 68 155 194 227 84 298 213 534 97 163 198 321
## [307] 97 115 157 144 207 41 125 77 164 178 78 223 329 276 216 271 153 95
## [325] 267 243 126 205 69 149 209 56 188 166 134 71 350 171 62 139 165 213
## [343] 199 303 121 331 143 283 126 192 178 215 197 173 83 226 126 160 151 120
## [361] 379 137 284 192 102 107 178 191 329 107 121 207 139 105 458 174 231 395
## [379] 207 157 1 131 87 101 334 220 111 161 142 158 210 284 222 17 84 136
## [397] 325 268 112 217 240 395 138 192 110 310 54 249 322 221 192 110 113 96
## [415] 101 205 274 218 211 261 305 256 365 187 87 153 233 338 296 337 353 95
## [433] 0 144 93 132 131 53 90 217 167 200 267 206 157 202 6 182 188 187
## [451] 249 405 409 104 222 194 214
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr12_49031034_49031174
## [1] 203 54 416 9 128 269 240 143 134 345 142 128 165 82 94 138 214 137
## [19] 100 121 113 236 81 137 109 64 124 6 219 172 256 92 145 141 204 135
## [37] 96 84 94 110 247 123 160 110 24 156 26 113 138 70 91 118 109 242
## [55] 194 169 117 78 142 148 172 104 63 65 93 74 88 60 169 199 223 62
## [73] 106 145 124 178 195 165 73 99 158 116 163 116 107 161 60 39 135 100
## [91] 235 219 171 157 166 88 102 201 146 199 256 193 120 88 73 128 133 78
## [109] 70 84 150 85 116 319 60 159 278 160 138 113 143 93 138 130 186 142
## [127] 57 99 103 175 91 142 86 139 38 126 124 79 83 56 59 181 173 98
## [145] 141 136 61 130 138 260 97 338 213 97 265 52 136 148 71 106 125 153
## [163] 138 170 178 89 177 102 133 102 73 190 52 58 199 117 214 160 136 103
## [181] 198 107 211 77 91 54 233 69 141 214 169 164 73 132 213 90 181 172
## [199] 220 209 149 56 246 150 100 68 199 193 48 93 158 89 68 260 108 76
## [217] 55 196 110 41 148 93 159 130 220 132 107 75 81 240 125 234 104 96
## [235] 126 121 124 190 43 151 209 72 185 226 173 118 72 3 88 80 98 105
## [253] 109 220 81 146 189 183 166 85 158 184 87 198 193 88 236 116 95 129
## [271] 185 76 143 128 126 106 141 158 148 233 90 141 109 90 60 160 57 89
## [289] 178 85 163 121 123 46 63 134 52 208 90 189 164 267 73 119 161 188
## [307] 52 83 110 112 138 39 87 79 129 135 65 155 200 158 132 202 107 78
## [325] 211 224 107 155 75 143 189 48 75 116 95 51 257 112 44 123 104 150
## [343] 121 190 127 243 129 213 58 160 111 111 111 112 63 125 116 126 110 96
## [361] 254 80 124 115 72 109 111 118 169 81 120 115 125 89 165 101 174 212
## [379] 158 146 0 108 76 78 169 181 98 165 136 167 172 198 163 21 64 149
## [397] 261 194 107 126 184 233 96 134 64 231 91 137 204 135 152 92 135 52
## [415] 83 103 181 154 144 144 184 173 204 103 78 113 105 208 194 247 177 80
## [433] 2 89 55 87 87 42 99 101 97 138 211 148 136 144 10 132 162 174
## [451] 151 288 204 82 106 87 145
Count the reads of all the splice junctions of the gene harboring the variant:
GeneSJ$rowSum_SJtotal <- rowSums(GeneSJ[,grep("chr", names(GeneSJ))])
Normalization of the expression by the total read counts of all the splice junctions of the gene:
GeneSJ$Normalized_CanonDE <- (GeneSJ$chr12_49031034_49031174)/GeneSJ$rowSum_SJtotal*100
GeneSJ$Normalized_CanonUE <- (GeneSJ$chr12_49033965_49034066)/GeneSJ$rowSum_SJtotal*100
GeneSJ$Normalized_AG <- (GeneSJ$chr12_49031313_49032512)/GeneSJ$rowSum_SJtotal*100
Download the normalized values for the assessed splice junctions of all the AML samples:
Mutated samples vaf:
Canonical splice junction:
Splicing alterations:
Canonical splice junction:
Splicing alteration:
Violin Plots for the alternative splice junctions interrogated:
Acceptor Loss:
Donor Loss:
Acceptor Gain:
SJCounts <- GeneSJ
Normality Test:
shapiro.test(SJCounts$Normalized_CanonDE[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_CanonDE[SJCounts$GROUP == "WT"]
## W = 0.98286, p-value = 3.366e-05
Mean normalized expression of the canonical SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_CanonDE[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 2.346576
Normalized expression of the canonical SJ in the MUT samples:
MUT_SJi <- SJCounts$Normalized_CanonDE[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 2.458057 1.932859 2.051164
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_CanonDE - mean_WT_SJi
Difference in the MUT sample: deviation of the Normalized expression of the MUT patient from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] 0.1114811 -0.4137173 -0.2954120
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:452] = -2.3466, -1.4753, -1.2799, ..., 1.4057, 1.6941
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.6013216 0.1696035 0.2422907
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_CanonDE")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
MUT_df$Pvalue <- MUT_df$ECDF
MUT_df$Prediction <- "Donor Loss"
MUT_df$splice_junction_status <- "CanonicalSJ"
MUT_df$splice_junction_position <- "chr12:49031034-49031174"
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
MUT_df$Pvalue <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$Pvalue)
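The percentile logic can be illustrated on a toy reference set. The sketch below uses invented values and hypothetical names (`wt`, `mut`, `p_loss`, `p_gain`); it shows the lower-tail reading used for canonical junctions (loss) and the upper-tail reading, `1 - ECDF`, that this report applies to gained junctions.

```r
# Invented WT reference values (normalized expression, in percent).
wt <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
mut <- 2.5

# Centre on the WT mean, as in the analysis.
wt_diff <- wt - mean(wt)
mut_diff <- mut - mean(wt)

wt_ecdf <- ecdf(wt_diff)

# Lower tail: fraction of WT samples at or below the mutated sample.
# Small values point to reduced usage (loss of a canonical junction).
p_loss <- wt_ecdf(mut_diff)

# Upper tail: fraction of WT samples strictly above the mutated sample.
# Small values point to increased usage (gain of an alternative junction).
p_gain <- 1 - wt_ecdf(mut_diff)

c(p_loss = p_loss, p_gain = p_gain)
```

Subtracting the WT mean shifts both the reference and the mutated value by the same constant, so `ecdf(wt)(mut)` returns the same percentile; the centring mainly makes the deviation itself readable.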
Download the VAF, inferred percentiles and p-values of the mutated samples:
Normality Test:
shapiro.test(SJCounts$Normalized_CanonUE[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_CanonUE[SJCounts$GROUP == "WT"]
## W = 0.91571, p-value = 3.157e-15
Mean normalized expression of the canonical SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_CanonUE[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 3.337412
Normalized expression of the canonical SJ in the MUT samples:
MUT_SJi <- SJCounts$Normalized_CanonUE[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 2.848225 2.848423 3.318737
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_CanonUE - mean_WT_SJi
Difference in the MUT sample: deviation of the Normalized expression of the MUT patient from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] -0.48918776 -0.48898930 -0.01867546
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:451] = -3.3374, -3.0627, -2.6438, ..., 1.2966, 1.3287
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.1519824 0.1519824 0.4713656
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_CanonUE")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
MUT_df$Pvalue <- MUT_df$ECDF
MUT_df$Prediction <- "Aceptor Loss"
MUT_df$splice_junction_status <- "CanonicalSJ"
MUT_df$splice_junction_position <- "chr12:49033965-49034066"
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
MUT_df$Pvalue <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$Pvalue)
Download the VAF, inferred percentiles and p-values of the mutated samples:
Normality Test:
shapiro.test(SJCounts$Normalized_AG[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_AG[SJCounts$GROUP == "WT"]
## W = 0.89952, p-value < 2.2e-16
Value of Mean Normalized Expression of the Alternative SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_AG[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 0.06900696
Normalized Expression Value of the Alternative SJ in the MUT sample:
MUT_SJi <- SJCounts$Normalized_AG[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 0.07803355 0.00000000 0.04609357
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_AG - mean_WT_SJi
Difference in the MUT sample: deviation of the Normalized expression of the MUT patient from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] 0.009026599 -0.069006955 -0.022913385
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:374] = -0.069007, -0.061605, -0.06126, ..., 0.27096, 0.2882
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.6167401 0.1718062 0.4317181
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_AG")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
MUT_df$Pvalue <- 1 - MUT_df$ECDF
MUT_df$Prediction <- "Aceptor Gain"
MUT_df$splice_junction_status <- "AlternativeSJ found in MUT samples"
MUT_df$splice_junction_position <- "chr12:49031313-49032512"
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
MUT_df$Pvalue <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$Pvalue)
Download the VAF, inferred percentiles and p-values of the mutated samples:
Variant found in 1 patient of the BeatAML (1 sample)
The splicing alterations being assessed are:
Load the extracted splice junctions of the gene harboring the mutation.
extractedSJ_path <- paste0(extractedSJ_dir_in,"TET2_UM_annotSJ.tsv")
GeneSJ <- read.delim(extractedSJ_path, sep ="\t")
Set each sample's group: mutated (MUT) or non-mutated (WT)
samples_df <- found_variants[found_variants$Gene=="TET2" & found_variants$MutationKey_Hg38 == "chr4,105259774,G,A",]
cases <- samples_df$RNA_Sample[samples_df$Validable == "Validable"]
GeneSJ$GROUP <- ifelse(GeneSJ$sample_id %in% cases , "MUT", "WT")
Search for the splice junctions of interest in the extracted splice junctions of the gene by position (chr_SJstart_SJend).
Search: predicted at 38bp from the variant, chr4:105259811
Show all the splice junctions containing the positions between 105259810-105259819
colnames(GeneSJ)[grep("10525981",colnames(GeneSJ))]
## character(0)
Alternative SJ not found in the splice junction collection.
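Searching with a truncated coordinate such as "10525981" covers a ten-base window, but it is a substring match and can also hit the same digits elsewhere in a column name. A possible alternative, sketched below under the assumption that junction columns are named `chr_start_end`, parses the coordinates and filters numerically; `find_sj_near` and the toy column names are hypothetical.

```r
# Hypothetical helper: return junction columns whose start or end lies
# within `window` bp of `pos` on chromosome `chrom`.
find_sj_near <- function(col_names, chrom, pos, window = 10) {
  sj <- grep("^chr[^_]+_[0-9]+_[0-9]+$", col_names, value = TRUE)
  parts <- strsplit(sj, "_", fixed = TRUE)
  sj_chr <- vapply(parts, `[`, character(1), 1)
  sj_start <- as.numeric(vapply(parts, `[`, character(1), 2))
  sj_end <- as.numeric(vapply(parts, `[`, character(1), 3))
  near <- sj_chr == chrom &
    (abs(sj_start - pos) <= window | abs(sj_end - pos) <= window)
  sj[near]
}

# Invented column names for illustration.
toy_cols <- c("sample_id", "chr4_1000_2000", "chr4_1005_3000",
              "chr4_1500_2500", "chr12_1000_2000")
find_sj_near(toy_cols, "chr4", 1002, window = 5)
```

An empty result from such a helper carries the same meaning as `character(0)` from `grep`: no junction with an end point near the predicted site was observed in any sample.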
Search: chr4:105259708-105261758
Show all the splice junctions containing the position chr4:105259708-105261758
colnames(GeneSJ)[grep("105259708_105261758",colnames(GeneSJ))]
## [1] "chr4_105259708_105261758"
Found: chr4_105259708_105261758
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr4_105259708_105261758
## [1] 1 1 0 0 0 0 0 1 0 1 1 0 0 0 0 2 0 1 0 0 0 0 0 0 0
## [26] 0 0 0 0 0 1 1 0 0 2 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0
## [51] 0 0 0 0 0 0 0 0 0 0 1 0 2 0 3 0 0 0 0 0 0 0 0 3 0
## [76] 0 0 0 0 0 0 0 2 2 0 0 0 2 0 0 0 0 0 0 0 0 1 0 0 1
## [101] 0 0 0 0 2 1 0 0 0 0 0 2 0 0 0 2 0 0 0 0 0 1 0 0 0
## [126] 0 0 0 0 0 0 0 0 0 0 2 0 0 1 0 0 0 0 0 0 0 2 0 0 0
## [151] 0 0 0 0 2 2 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0
## [176] 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0 1 0 0 0 0 0
## [201] 1 0 0 0 0 0 1 0 0 0 0 0 0 2 0 0 0 0 1 1 0 0 0 0 0
## [226] 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 3 0 0 4 0 0 0 0 0 0
## [251] 0 1 0 0 1 0 0 0 0 0 1 0 0 1 0 0 0 2 0 0 0 0 0 0 0
## [276] 0 0 0 31 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 2 0 0 0 1
## [301] 0 0 0 1 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 2 1 0 1
## [326] 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1
## [351] 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 1 0 0 1 0 0
## [376] 1 0 0 3 0 0 0 0 0 2 0 0 0 0 0 0 1 0 0 0 0 0 1 0 2
## [401] 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 2 0 1 0 0 0 0
## [426] 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 4 0 0 0 0 0 0 0
## [451] 0 0 0 0 0 0 0
Samples with the SJ of interest:
table(GeneSJ$chr4_105259708_105261758>0)
##
## FALSE TRUE
## 374 83
Groups of the samples having the alternative splice junction:
table(GeneSJ$GROUP[GeneSJ$chr4_105259708_105261758 > 0])
##
## MUT WT
## 1 82
Alternative SJ found in the mutated samples.
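Presence alone does not separate the groups here, since 82 WT samples also carry at least one read for this junction. The two tables above can be combined into a single cross-tabulation, as in this sketch on invented data (`toy`, `tab` and `prop` are hypothetical names):

```r
# Invented read counts for one junction across six samples.
toy <- data.frame(
  GROUP = c("MUT", "WT", "WT", "WT", "WT", "WT"),
  reads = c(31, 0, 1, 0, 0, 2),
  stringsAsFactors = FALSE
)

# Rows: sample group; columns: whether the junction has any supporting read.
tab <- table(GROUP = toy$GROUP, has_SJ = toy$reads > 0)
tab

# Row-wise proportions: share of each group in which the junction is seen.
prop <- prop.table(tab, margin = 1)
prop
```

Because low-count background usage is common in WT samples, the comparison of normalized expression that follows is what carries the evidence, not the presence call.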
Search: chr4:105243779-105261758
Show all the splice junctions containing the position chr4:105243779-105261758
colnames(GeneSJ)[grep("105243779_105261758",colnames(GeneSJ))]
## character(0)
Alternative SJ not found in the splice junction collection.
Exon 6-7: chr4:105243779-105259618
Exon 7-8: chr4:105259770-105261758; donor splice site: chr4:105259770
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr4_105243779_105259618
## [1] 46 13 84 1 13 1 20 33 34 45 66 44 19 8 14 36 101 61
## [19] 15 37 3 31 15 49 9 7 54 1 9 71 60 33 60 54 41 17
## [37] 14 16 17 7 59 23 64 23 8 29 6 15 27 39 9 20 68 28
## [55] 72 32 12 10 100 45 29 3 32 3 73 18 19 9 22 41 29 5
## [73] 4 63 19 61 1 54 11 45 24 50 8 52 9 19 3 20 47 29
## [91] 84 30 30 22 10 109 45 71 71 52 51 41 23 6 43 55 9 45
## [109] 16 28 35 56 8 19 17 55 33 19 57 22 8 28 44 48 50 45
## [127] 5 55 27 10 63 38 46 25 40 47 18 21 12 22 16 52 13 5
## [145] 15 24 13 17 56 38 74 107 28 27 43 19 76 30 7 25 66 79
## [163] 45 17 18 44 4 7 29 20 50 43 17 16 29 11 39 34 53 13
## [181] 25 21 16 51 15 11 39 1 59 13 7 17 14 24 60 15 80 51
## [199] 46 63 39 6 16 22 72 10 43 51 51 31 9 33 9 106 19 9
## [217] 7 55 30 24 13 17 54 23 27 12 72 15 38 62 54 12 16 20
## [235] 33 37 19 13 73 21 50 13 37 57 71 7 17 3 21 11 12 30
## [253] 58 101 14 50 57 88 35 14 49 21 35 43 41 16 44 18 9 29
## [271] 39 7 53 34 42 39 26 39 51 12 35 38 30 9 9 72 7 5
## [289] 33 13 23 32 52 24 21 74 41 46 6 74 18 10 17 32 14 64
## [307] 36 12 13 12 58 9 68 10 64 48 9 39 27 24 29 99 25 3
## [325] 66 15 101 59 16 43 54 25 9 16 13 21 24 20 15 30 27 17
## [343] 76 30 14 22 34 43 5 33 21 20 63 29 3 7 4 28 26 4
## [361] 20 6 42 3 10 37 35 20 12 42 29 16 54 13 23 44 47 5
## [379] 55 16 1 28 12 18 46 54 3 34 26 50 19 29 49 1 3 31
## [397] 33 38 18 66 55 25 34 34 13 20 7 14 35 29 76 14 81 9
## [415] 23 23 31 32 59 1 23 29 4 21 27 12 7 13 28 91 33 28
## [433] 0 17 15 37 23 16 15 32 6 17 106 24 17 77 15 36 56 45
## [451] 62 52 41 6 8 2 55
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr4_105259770_105261758
## [1] 114 14 222 2 68 7 76 96 71 199 139 69 53 32 73 70 175 136
## [19] 45 67 28 58 36 127 22 17 146 3 58 123 171 98 99 84 205 60
## [37] 65 61 41 53 160 68 95 39 26 94 15 58 65 94 23 55 133 92
## [55] 189 119 16 44 263 89 89 47 71 42 122 44 50 12 50 100 66 31
## [73] 36 157 30 178 18 209 36 59 74 97 68 112 48 62 17 26 93 66
## [91] 239 70 122 60 85 226 110 146 126 113 151 146 58 31 108 97 33 74
## [109] 57 67 169 135 32 86 75 106 132 52 152 58 44 43 87 138 143 153
## [127] 18 100 90 35 168 73 75 150 77 90 43 82 40 67 35 128 43 19
## [145] 41 64 38 43 99 60 175 161 111 55 210 53 208 81 23 107 140 198
## [163] 102 62 116 82 15 33 55 33 132 116 64 35 82 36 99 68 103 31
## [181] 76 64 55 62 60 21 165 5 73 65 59 62 51 57 127 36 148 108
## [199] 145 102 82 26 50 56 86 45 160 100 109 79 42 49 29 135 40 35
## [217] 8 141 66 57 40 39 165 44 73 43 142 37 50 150 55 39 62 70
## [235] 97 92 62 33 115 65 106 61 72 153 208 39 53 4 69 30 34 74
## [253] 98 198 33 139 162 189 92 17 103 71 81 86 125 76 125 48 21 78
## [271] 88 34 102 125 151 54 103 85 71 55 62 82 116 57 23 145 27 50
## [289] 152 59 103 114 116 50 39 110 67 69 16 138 39 37 78 77 36 186
## [307] 70 35 24 32 128 20 170 18 98 59 39 98 85 90 69 171 110 12
## [325] 130 65 183 174 68 139 106 67 29 29 54 42 76 38 54 83 86 88
## [343] 137 126 26 97 92 88 31 78 88 68 135 78 19 23 15 59 114 69
## [361] 67 31 100 31 36 70 67 45 95 87 55 85 81 62 120 151 152 52
## [379] 120 71 0 69 36 66 92 125 20 73 50 97 34 67 183 0 39 62
## [397] 0 87 45 214 106 167 75 59 47 58 15 59 119 46 108 42 92 31
## [415] 39 77 81 79 148 7 113 46 67 50 62 43 36 80 78 163 142 41
## [433] 0 67 74 93 49 30 22 86 29 47 214 87 39 122 22 67 103 96
## [451] 118 133 120 25 43 28 115
Count the reads of all the splice junctions of the gene harboring the variant:
GeneSJ$rowSum_SJtotal <- rowSums(GeneSJ[,grep("chr", names(GeneSJ))])
Normalization of the expression by the total read counts of all the splice junctions of the gene:
GeneSJ$Normalized_CanonEx7_8 <- (GeneSJ$chr4_105259770_105261758)/GeneSJ$rowSum_SJtotal*100
GeneSJ$Normalized_Ex7DG <- (GeneSJ$chr4_105259708_105261758)/GeneSJ$rowSum_SJtotal*100
Download the normalized values for the assessed splice junctions of all the AML samples:
Mutated samples VAF:
Canonical splice junction:
Splicing alterations:
Canonical splice junction:
Splicing alteration:
Violin Plots for the alternative splice junctions interrogated:
Donor Loss:
Donor Gain:
SJCounts <- GeneSJ
Normality Test:
shapiro.test(SJCounts$Normalized_CanonEx7_8[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_CanonEx7_8[SJCounts$GROUP == "WT"]
## W = 0.91829, p-value = 5.209e-15
Mean normalized expression of the canonical SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_CanonEx7_8[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 10.47262
Normalized expression of the canonical SJ in the MUT sample:
MUT_SJi <- SJCounts$Normalized_CanonEx7_8[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 7.802198
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_CanonEx7_8 - mean_WT_SJi
Difference in the MUT sample: deviation of the Normalized expression of the MUT patient from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] -2.670419
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:449] = -10.473, -9.7634, -7.9903, ..., 6.9692, 20.297
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.1469298
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_CanonEx7_8")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
MUT_df$Pvalue <- MUT_df$ECDF
MUT_df$Prediction <- "Donor Loss"
MUT_df$splice_junction_status <- "CanonicalSJ"
MUT_df$splice_junction_position <- "chr4:105259770-105261758"
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
MUT_df$Pvalue <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$Pvalue)
Download the VAF, inferred percentiles and p-values of the mutated samples:
Normality Test:
shapiro.test(SJCounts$Normalized_Ex7DG[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_Ex7DG[SJCounts$GROUP == "WT"]
## W = 0.44616, p-value < 2.2e-16
Value of Mean Normalized Expression of the Alternative SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_Ex7DG[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 0.03142559
Normalized Expression Value of the Alternative SJ in the MUT sample:
MUT_SJi <- SJCounts$Normalized_Ex7DG[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 3.406593
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_Ex7DG - mean_WT_SJi
Difference in the MUT sample: deviation of the Normalized expression of the MUT patient from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] 3.375168
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:81] = -0.031426, 0.024881, 0.033383, ..., 0.51502, 0.57833
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 1
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_Ex7DG")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
MUT_df$Pvalue <- 1 - MUT_df$ECDF
MUT_df$Prediction <- "Donor Gain"
MUT_df$splice_junction_status <- "AlternativeSJ found in MUT samples"
MUT_df$splice_junction_position <- "chr4:105259708-105261758"
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
MUT_df$Pvalue <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$Pvalue)
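For this junction the ECDF value is 1, so `1 - ECDF` gives a p-value of exactly 0. That only says no WT sample exceeded the mutated sample; with a finite reference set the smallest value that can be resolved is about 1/(n + 1). One common alternative is an empirical p-value with a +1 correction, sketched below on invented data. This is an assumption offered for comparison, not the method used in this report, and `empirical_p_upper` is a hypothetical helper.

```r
# Upper-tail empirical p-value with a +1 correction:
# (number of reference values >= observed + 1) / (n + 1).
empirical_p_upper <- function(reference, observed) {
  reference <- reference[!is.na(reference)]  # also drops NaN
  vapply(observed, function(x) {
    (sum(reference >= x) + 1) / (length(reference) + 1)
  }, numeric(1))
}

# Invented WT reference values and one strongly elevated mutated value.
wt_ref <- c(0, 0, 0, 0.02, 0.03, 0.05, 0.1, 0.5, 0.6)
mut_value <- 3.4

p_plain <- 1 - ecdf(wt_ref)(mut_value)
p_corrected <- empirical_p_upper(wt_ref, mut_value)

c(p_plain = p_plain, p_corrected = p_corrected)
```

With nine reference values the corrected p-value cannot drop below 0.1, which makes the resolution limit of the reference set explicit.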
Download the VAF, inferred percentiles and p-values of the mutated samples:
Variant found in 3 patients of the BeatAML (3 samples)
The splicing alterations being assessed are:
Variant information:
Load the extracted splice junctions of the gene harboring the mutation.
extractedSJ_path <- paste0(extractedSJ_dir_in,"PTPN11_UM_annotSJ.tsv")
GeneSJ <- read.delim(extractedSJ_path, sep ="\t")
Set each sample's group: mutated (MUT) or non-mutated (WT)
samples_df <- found_variants[found_variants$Gene=="PTPN11" & found_variants$MutationKey_Hg38 == "chr12,112489084,G,T",]
cases <- samples_df$RNA_Sample[samples_df$Validable == "Validable"]
GeneSJ$GROUP <- ifelse(GeneSJ$sample_id %in% cases , "MUT", "WT")
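The group assignment silently labels a carrier as absent if its RNA sample ID is missing from the junction table. For this variant the heading reports 3 samples while only two mutated values are scored further down; the source does not say why. A small sanity check along the following lines could make such cases visible. All names and values here are invented, including the "Not validable" label.

```r
# Invented variant table and junction table for illustration.
toy_variants <- data.frame(
  RNA_Sample = c("A", "B", "C"),
  Validable = c("Validable", "Validable", "Not validable"),
  stringsAsFactors = FALSE
)
toy_junctions <- data.frame(sample_id = c("A", "D", "E"),
                            stringsAsFactors = FALSE)

# Same assignment rule as in the analysis.
toy_cases <- toy_variants$RNA_Sample[toy_variants$Validable == "Validable"]
toy_junctions$GROUP <- ifelse(toy_junctions$sample_id %in% toy_cases,
                              "MUT", "WT")

# Carriers expected in the MUT group but absent from the junction table.
missing_cases <- setdiff(toy_cases, toy_junctions$sample_id)
group_counts <- table(toy_junctions$GROUP)

missing_cases
group_counts
```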
Search for the splice junctions of interest in the extracted splice junctions of the gene by position (chr_SJstart_SJend).
Search: chr12:112489079-112502143
colnames(GeneSJ)[grep("112489079",colnames(GeneSJ))]
## [1] "chr12_112489079_112502143" "chr12_112489079_112502148"
## [3] "chr12_112489079_112504694" "chr12_112489079_112505824"
Show all the splice junctions containing the position chr12:112489079-112502143
colnames(GeneSJ)[grep("112489079_112502143",colnames(GeneSJ))]
## [1] "chr12_112489079_112502143"
Found: chr12_112489079_112502143
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr12_112489079_112502143
## [1] 0 0 0 1 0 0 0 1 0 2 0 0 1 0 0 1 1 1 3 0 2 0 1 3 0
## [26] 1 0 0 3 0 0 5 0 0 1 3 0 1 0 1 0 0 2 2 0 2 0 1 0 0
## [51] 0 0 0 0 2 1 0 1 2 0 1 1 2 0 0 0 0 0 0 0 0 0 5 0 0
## [76] 0 0 1 0 1 2 0 2 0 0 1 0 0 0 1 3 2 1 1 2 1 0 0 0 0
## [101] 0 0 0 0 0 1 1 0 0 0 0 0 3 1 0 0 0 1 2 0 0 0 0 2 1
## [126] 1 2 1 1 3 1 1 0 0 0 2 1 0 0 0 0 1 0 0 0 0 0 1 1 1
## [151] 0 0 0 0 0 2 0 0 1 0 0 1 1 0 1 0 0 3 0 0 0 0 2 0 0
## [176] 0 1 0 0 0 0 0 3 2 1 1 2 0 0 4 1 3 0 0 0 2 0 5 0 0
## [201] 0 1 0 1 0 0 0 0 0 3 2 1 1 1 0 0 0 0 0 1 2 0 1 1 2
## [226] 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 3 0 1 1 0 2 0 0 0
## [251] 0 2 0 0 0 1 0 0 1 0 0 10 0 0 0 1 1 3 2 4 5 1 0 0 0
## [276] 0 0 0 0 1 2 0 2 1 2 0 0 5 1 0 2 2 0 0 0 0 0 0 0 0
## [301] 0 1 0 0 1 0 2 0 0 0 1 0 0 0 0 0 0 0 1 2 0 1 2 0 2
## [326] 0 0 0 1 0 1 0 1 0 1 0 0 0 2 3 0 0 2 0 0 0 0 0 2 0
## [351] 0 2 1 3 2 0 1 0 0 2 0 2 2 0 0 0 1 5 1 2 0 0 0 2 0
## [376] 1 0 1 0 1 0 0 0 3 0 0 0 0 0 0 0 0 2 0 3 0 0 4 3 0
## [401] 0 0 1 0 3 0 1 2 3 0 0 0 0 0 0 0 0 2 0 3 0 1 1 3 0
## [426] 0 0 2 0 0 0 0 0 2 0 0 0 2 2 1 1 0 0 0 0 0 0 1 0 0
## [451] 0 0 1 0 0 1 0
Samples with the SJ of interest:
table(GeneSJ$chr12_112489079_112502143>0)
##
## FALSE TRUE
## 282 175
Groups of the samples having the alternative splice junction:
table(GeneSJ$GROUP[GeneSJ$chr12_112489079_112502143 > 0])
##
## MUT WT
## 2 173
Alternative SJ found in the mutated samples.
Exon 12-13: chr12:112488511-112489023
Exon 13-14: chr12:112489176-112502143; splice site chr12:112489176
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr12_112488511_112489023
## [1] 156 53 242 3 96 236 300 249 93 150 247 148 119 163 191 256 183 363
## [19] 170 138 207 224 102 175 136 99 121 1 101 216 307 107 164 134 95 109
## [37] 113 123 89 237 263 222 163 202 58 220 70 81 111 75 84 85 78 275
## [55] 318 282 56 253 126 195 195 70 101 185 88 116 241 110 133 196 339 88
## [73] 259 175 133 206 196 353 92 240 266 103 203 165 124 197 62 144 96 129
## [91] 362 196 277 184 105 196 148 133 92 325 118 205 260 147 200 206 184 126
## [109] 137 75 209 110 170 368 73 248 388 187 74 132 124 139 153 209 398 126
## [127] 99 115 198 127 132 129 80 227 113 111 98 79 159 37 58 210 210 85
## [145] 145 171 58 134 232 191 133 448 157 135 211 136 285 204 81 187 159 192
## [163] 90 150 317 80 119 76 133 115 106 252 136 81 112 213 471 242 98 184
## [181] 183 200 415 94 107 98 157 45 148 169 203 133 148 67 258 55 138 170
## [199] 256 132 160 99 192 238 127 159 317 244 136 151 131 130 48 204 122 103
## [217] 68 216 149 133 86 113 132 192 220 149 126 77 138 203 164 125 102 57
## [235] 205 134 47 217 83 198 190 201 218 207 143 137 152 2 161 113 158 83
## [253] 164 155 142 194 112 140 278 76 143 162 156 186 168 159 195 159 105 261
## [271] 391 113 160 228 179 131 104 121 135 192 173 145 103 206 49 213 52 102
## [289] 71 91 147 447 204 101 117 146 296 149 94 238 299 298 93 163 190 217
## [307] 75 82 96 75 177 31 159 80 167 88 86 78 219 133 83 133 94 64
## [325] 156 245 153 107 208 48 184 137 145 105 155 178 230 190 104 84 89 217
## [343] 291 349 189 199 127 240 121 176 73 178 248 114 99 103 89 132 166 129
## [361] 317 92 180 78 107 133 116 135 182 180 191 190 175 82 129 156 164 280
## [379] 199 210 2 79 51 71 45 191 167 164 163 95 206 170 150 11 118 153
## [397] 83 173 115 167 147 97 72 153 79 456 63 159 279 171 230 60 102 38
## [415] 189 114 243 154 161 167 142 191 222 174 114 131 110 233 169 208 180 105
## [433] 1 166 136 115 143 122 171 65 91 197 240 84 207 70 9 94 216 233
## [451] 118 229 201 120 95 65 242
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chr12_112489176_112502143
## [1] 274 65 339 1 120 419 538 394 139 260 375 238 196 219 323 412 222 574
## [19] 231 206 163 389 144 174 162 154 193 3 125 249 467 181 235 143 118 160
## [37] 178 226 148 323 446 356 260 268 67 353 27 96 174 103 113 128 86 499
## [55] 419 424 80 281 239 292 236 119 132 242 167 184 370 209 215 301 473 157
## [73] 296 347 159 354 277 605 130 374 411 153 319 283 183 287 84 210 126 204
## [91] 585 317 426 329 149 333 175 182 128 505 193 310 322 191 355 263 301 198
## [109] 159 113 312 212 297 504 99 399 580 249 162 185 136 161 207 322 638 221
## [127] 136 179 327 212 222 214 127 303 184 148 139 115 233 46 77 315 286 136
## [145] 186 210 105 200 383 295 197 623 261 218 324 258 421 300 110 288 223 237
## [163] 111 305 361 105 173 128 191 183 110 454 198 123 173 279 759 311 164 217
## [181] 285 252 607 123 197 178 240 66 213 248 303 162 215 103 351 82 198 219
## [199] 360 152 217 165 319 232 200 239 413 420 223 188 218 171 74 345 128 108
## [217] 98 359 222 170 95 135 235 268 359 186 231 126 220 318 211 199 170 96
## [235] 299 187 50 296 105 302 248 259 326 355 243 218 154 3 327 145 183 140
## [253] 234 310 185 237 188 189 392 102 228 209 197 324 269 243 377 280 128 420
## [271] 448 144 211 333 296 177 194 154 251 271 347 174 181 294 69 259 67 177
## [289] 84 113 192 295 288 117 190 195 233 268 121 153 419 398 130 206 361 318
## [307] 130 93 126 100 284 63 236 157 233 156 137 127 364 183 150 212 137 76
## [325] 276 372 153 179 341 81 220 196 206 172 204 245 391 307 181 125 155 309
## [343] 201 616 229 262 185 403 178 215 111 346 272 182 124 115 122 173 283 228
## [361] 428 105 233 114 88 221 203 235 270 226 186 353 208 106 151 286 206 318
## [379] 343 273 1 124 96 87 66 313 317 207 285 139 307 265 253 2 157 195
## [397] 104 284 208 225 240 149 119 205 92 708 103 231 391 273 346 63 125 54
## [415] 252 115 296 278 223 264 217 216 277 248 185 206 114 368 233 407 307 147
## [433] 0 271 217 168 189 141 212 91 114 241 289 160 303 122 14 120 265 358
## [451] 208 289 301 164 135 70 335
Count the reads of all the splice junctions of the gene harboring the variant:
GeneSJ$rowSum_SJtotal <- rowSums(GeneSJ[,grep("chr", names(GeneSJ))])
Normalization of the expression by the total read counts of all the splice junctions of the gene:
GeneSJ$Normalized_CanonEx13_14 <- (GeneSJ$chr12_112489176_112502143)/GeneSJ$rowSum_SJtotal*100
GeneSJ$Normalized_Ex13DG <- (GeneSJ$chr12_112489079_112502143)/GeneSJ$rowSum_SJtotal*100
Download the normalized values for the assessed splice junctions of all the AML samples:
Mutated samples VAF:
Canonical splice junction:
Splicing alterations:
Canonical splice junction:
Splicing alteration:
Violin Plots for the alternative splice junctions interrogated:
Donor Loss:
Donor Gain:
SJCounts <- GeneSJ
Normality Test:
shapiro.test(SJCounts$Normalized_CanonEx13_14[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_CanonEx13_14[SJCounts$GROUP == "WT"]
## W = 0.91555, p-value = 2.918e-15
Mean normalized expression of the canonical SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_CanonEx13_14[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 8.050041
Normalized expression of the canonical SJ in the MUT samples:
MUT_SJi <- SJCounts$Normalized_CanonEx13_14[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 7.403471 7.817386
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_CanonEx13_14 - mean_WT_SJi
Difference in the MUT sample: deviation of the Normalized expression of the MUT patient from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] -0.6465693 -0.2326549
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:452] = -8.05, -7.3458, -6.9631, ..., 4.0852, 11.95
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.3296703 0.4483516
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_CanonEx13_14")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
MUT_df$Pvalue <- MUT_df$ECDF
MUT_df$Prediction <- "Donor Loss"
MUT_df$splice_junction_status <- "CanonicalSJ"
MUT_df$splice_junction_position <- "chr12:112489176-112502143"
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
MUT_df$Pvalue <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$Pvalue)
Download the VAF, inferred percentiles and p-values of the mutated samples:
Normality Test:
shapiro.test(SJCounts$Normalized_Ex13DG[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_Ex13DG[SJCounts$GROUP == "WT"]
## W = 0.53459, p-value < 2.2e-16
Value of Mean Normalized Expression of the Alternative SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_Ex13DG[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 0.02726744
Normalized Expression Value of the Alternative SJ in the MUT sample:
MUT_SJi <- SJCounts$Normalized_Ex13DG[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 0.3542331 0.1876173
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_Ex13DG - mean_WT_SJi
Difference in the MUT sample: deviation of the Normalized expression of the MUT patient from the mean normalized WT expression
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] 0.3269656 0.1603498
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:172] = -0.027267, -0.013522, -0.013448, ..., 0.2591, 0.67696
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.9978022 0.9846154
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_Ex13DG")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
MUT_df$Pvalue <- 1 - MUT_df$ECDF
MUT_df$Prediction <- "Donor Gain"
MUT_df$splice_junction_status <- "AlternativeSJ found in MUT samples"
MUT_df$splice_junction_position <- "chr12:112489079-112502143"
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
MUT_df$Pvalue <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$Pvalue)
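The same sequence of steps (WT mean, deviation, ECDF, tail-specific p-value, masking of zero expression) is repeated for every junction. A possible consolidation is sketched below; `score_junction` is a hypothetical helper that mirrors the steps shown in this report, and the toy data are invented.

```r
# Hypothetical wrapper around the per-junction steps.
# tail = "lower" for loss of a canonical junction, "upper" for a gain.
score_junction <- function(df, value_col, tail = c("lower", "upper")) {
  tail <- match.arg(tail)
  is_wt <- df$GROUP == "WT"
  is_mut <- df$GROUP == "MUT"

  # Deviation of every sample from the mean WT normalized expression.
  wt_mean <- mean(df[[value_col]][is_wt], na.rm = TRUE)
  difference <- df[[value_col]] - wt_mean
  wt_ecdf <- ecdf(difference[is_wt])

  out <- data.frame(
    sample_id = df$sample_id[is_mut],
    NormalizedExpression = df[[value_col]][is_mut],
    ECDF = wt_ecdf(difference[is_mut]),
    stringsAsFactors = FALSE
  )
  out$Pvalue <- if (tail == "lower") out$ECDF else 1 - out$ECDF

  # As in the report, samples with zero expression are not scored.
  zero <- !is.na(out$NormalizedExpression) & out$NormalizedExpression == 0
  out$ECDF[zero] <- NA
  out$Pvalue[zero] <- NA
  out
}

# Invented example: two MUT samples, four WT samples.
toy_counts <- data.frame(
  sample_id = paste0("S", 1:6),
  GROUP = c("MUT", "WT", "WT", "WT", "WT", "MUT"),
  Normalized_x = c(3.5, 1, 2, 3, 4, 0),
  stringsAsFactors = FALSE
)
res_lower <- score_junction(toy_counts, "Normalized_x", tail = "lower")
res_upper <- score_junction(toy_counts, "Normalized_x", tail = "upper")
res_upper
```

Keeping the tail as an explicit argument makes the loss and gain cases differ in one place only, which reduces the chance of a copy-and-paste slip between sections.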
Download the VAF, inferred percentiles and p-values of the mutated samples:
Variant found in 2 patients of the BeatAML (2 samples)
The splicing alterations being assessed are:
Variant information:
Load the extracted splice junctions of the gene harboring the mutation.
extractedSJ_path <- paste0(extractedSJ_dir_in,"KDM6A_UM_annotSJ.tsv")
GeneSJ <- read.delim(extractedSJ_path, sep ="\t")
Set each sample's group: mutated (MUT) or non-mutated (WT)
samples_df <- found_variants[found_variants$Gene=="KDM6A" & found_variants$MutationKey_Hg38 == "chrX,45062737,C,T",]
cases <- samples_df$RNA_Sample[samples_df$Validable == "Validable"]
GeneSJ$GROUP <- ifelse(GeneSJ$sample_id %in% cases , "MUT", "WT")
Search for the splice junctions of interest in the extracted splice junctions of the gene by position (chr_SJstart_SJend).
Search: predicted at 2bp from the variant, chrX:45062740
Show all the splice junctions containing the positions between 45062740-45062749. Found canonical donor: chrX:45062749
colnames(GeneSJ)[grep("4506274",colnames(GeneSJ))]
## [1] "chrX_45062749_45063379" "chrX_45062749_45063392" "chrX_45062749_45063421"
## [4] "chrX_45062749_45063434"
Show all the splice junctions containing the positions between 45062730-45062739.
colnames(GeneSJ)[grep("4506273",colnames(GeneSJ))]
## character(0)
Alternative SJ not found in the splice junction collection.
Search: chrX:45062720-45063421
colnames(GeneSJ)[grep("4506272",colnames(GeneSJ))]
## [1] "chrX_45062720_45063392" "chrX_45062720_45063421" "chrX_45062720_45063434"
## [4] "chrX_45062724_45063421"
Show all the splice junctions containing the position chrX:45062720-45063421
colnames(GeneSJ)[grep("45062720_45063421",colnames(GeneSJ))]
## [1] "chrX_45062720_45063421"
Found: chrX_45062720_45063421
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chrX_45062720_45063421
## [1] 0 0 0 0 0 0 0 1 1 0 0 0 0 2 0 2 0 0 0 0 0 0 1 5 0 0 0 0 2 0 0 1 0 0 2 2 2
## [38] 1 0 4 0 0 0 0 0 3 0 0 0 2 2 4 1 3 5 0 1 3 1 0 1 3 0 1 0 0 0 1 1 0 0 1 2 0
## [75] 0 0 7 3 0 0 1 0 3 2 2 1 0 1 2 0 3 5 5 4 0 0 0 0 1 5 5 0 1 2 2 0 2 2 2 2 0
## [112] 1 0 2 0 0 0 0 1 3 0 7 0 5 2 5 1 1 0 0 3 0 0 2 0 0 0 1 1 3 1 0 1 0 0 4 0 3
## [149] 0 0 1 1 0 0 2 0 2 1 0 5 0 0 0 2 2 0 1 1 0 1 6 4 0 0 1 0 2 4 2 0 0 0 1 0 0
## [186] 0 1 0 0 0 3 3 3 0 1 1 0 1 0 0 0 0 1 0 0 1 0 1 1 0 1 0 1 1 5 1 4 0 0 0 0 0
## [223] 4 1 3 3 6 0 0 0 0 0 4 2 2 3 0 2 0 0 0 1 1 2 0 0 2 0 1 3 3 5 1 1 2 4 4 0 0
## [260] 0 0 0 2 1 0 0 0 1 1 0 0 1 2 0 2 2 0 0 0 0 0 0 1 0 3 0 2 2 2 0 0 0 1 2 2 0
## [297] 0 0 0 0 2 0 3 0 0 0 4 2 0 2 0 0 6 0 8 0 1 0 3 3 0 1 3 0 0 0 0 2 0 1 0 2 1
## [334] 2 1 0 0 1 4 2 5 1 0 2 0 0 1 0 0 3 3 1 3 0 0 0 0 0 0 0 1 0 2 0 2 2 2 2 3 1
## [371] 1 4 1 0 1 3 0 0 1 3 0 2 0 0 0 2 1 0 0 2 1 0 0 0 1 0 0 0 1 1 0 0 0 0 1 0 0
## [408] 3 0 4 0 0 1 2 0 0 2 0 1 0 0 2 0 1 0 3 1 1 0 0 0 3 0 2 0 0 3 1 1 2 0 0 0 0
## [445] 0 1 0 1 4 0 1 1 0 0 1 0 0
Samples with the SJ of interest:
table(GeneSJ$chrX_45062720_45063421>0)
##
## FALSE TRUE
## 235 222
Groups of the samples having the alternative splice junction:
table(GeneSJ$GROUP[GeneSJ$chrX_45062720_45063421 > 0])
##
## MUT WT
## 2 220
Alternative SJ found in the mutated samples.
Exon upstream (UE): chrX:45061420-45062646
Exon downstream (DE): chrX:45062749-45063421; donor splice site: chrX:45062749
Reads of all the AML samples (mutated and non-mutated) for the splice junction:
GeneSJ$chrX_45061420_45062646
## [1] 40 35 93 5 101 34 52 61 18 120 41 28 95 116 28 48 44 72
## [19] 124 12 120 60 49 125 96 52 76 0 96 71 47 130 28 37 93 61
## [37] 43 81 43 231 62 40 47 41 29 140 13 63 74 31 94 307 59 43
## [55] 173 34 150 73 52 52 160 200 22 127 47 67 36 54 34 61 56 53
## [73] 71 72 39 152 364 107 134 7 50 19 135 61 82 37 74 38 49 48
## [91] 161 108 287 223 113 63 41 45 19 73 201 48 33 31 24 50 69 31
## [109] 88 93 105 49 75 61 50 63 84 25 42 45 85 63 43 102 58 163
## [127] 124 18 45 67 76 81 22 119 39 16 77 116 58 67 42 46 309 45
## [145] 33 219 17 128 132 49 29 205 214 72 57 60 101 96 47 49 45 159
## [163] 27 71 76 29 87 57 15 66 74 107 44 62 41 81 65 80 56 158
## [181] 65 40 45 46 99 47 77 37 61 198 67 156 208 40 26 110 40 44
## [199] 42 15 50 52 138 80 22 61 81 66 92 44 49 56 85 61 136 56
## [217] 49 85 29 85 144 97 77 184 81 53 126 59 30 76 32 187 164 34
## [235] 43 68 72 62 25 72 52 56 44 55 66 239 115 0 45 76 184 127
## [253] 76 63 35 94 91 49 31 12 35 36 64 67 48 81 48 40 66 65
## [271] 138 28 49 57 101 29 32 27 132 41 30 39 67 90 104 36 45 103
## [289] 55 54 75 109 56 35 27 28 40 53 67 82 71 144 98 52 61 105
## [307] 33 78 69 85 24 6 93 21 85 33 54 49 127 117 96 96 157 44
## [325] 43 48 39 33 37 46 62 54 70 59 36 45 36 26 38 17 47 94
## [343] 46 156 14 66 44 40 72 37 50 103 75 94 46 96 50 54 73 59
## [361] 110 102 84 124 55 33 37 73 115 38 57 90 42 92 288 78 39 101
## [379] 91 39 0 54 40 58 119 72 87 34 45 38 128 60 66 13 41 31
## [397] 34 24 30 44 54 211 40 24 77 33 15 138 107 53 43 36 40 31
## [415] 34 96 78 63 100 130 61 83 146 146 32 103 72 44 85 81 80 36
## [433] 0 100 34 93 42 92 35 104 55 141 43 58 67 41 14 130 33 56
## [451] 27 40 62 49 97 110 88
Read counts for the splice junction in all AML samples (mutated and non-mutated):
GeneSJ$chrX_45062749_45063421
## [1] 39 38 85 3 49 40 32 38 13 63 45 29 84 86 34 41 39 57
## [19] 100 14 80 64 40 108 93 39 67 1 83 53 26 109 15 32 42 54
## [37] 41 55 31 186 36 30 46 33 37 109 15 63 61 19 72 212 39 22
## [55] 136 24 116 57 55 79 120 191 29 61 47 46 19 50 29 49 41 53
## [73] 81 50 27 105 265 115 103 7 42 20 126 60 76 18 46 23 62 64
## [91] 112 72 222 171 76 60 35 35 19 66 155 26 40 30 26 61 58 18
## [109] 71 78 90 41 42 32 42 70 51 16 32 35 64 52 43 93 52 149
## [127] 110 19 53 47 63 68 21 65 54 23 74 88 63 57 32 50 225 38
## [145] 32 190 19 95 112 38 31 173 168 58 49 40 95 75 31 20 50 155
## [163] 22 61 38 33 62 54 20 44 67 88 22 70 52 69 41 74 43 137
## [181] 43 45 24 32 59 46 54 31 40 162 36 78 152 38 24 104 47 39
## [199] 50 25 36 48 103 71 23 35 49 41 75 53 35 54 54 43 122 50
## [217] 47 56 22 71 101 72 69 164 60 55 98 49 30 44 25 143 114 21
## [235] 30 56 54 77 15 57 66 49 37 28 65 185 112 0 28 42 153 99
## [253] 69 59 32 88 60 61 24 13 26 29 52 51 25 68 52 29 58 49
## [271] 110 27 38 29 80 25 27 25 127 31 25 31 58 74 84 21 42 62
## [289] 47 49 56 97 40 42 16 25 45 42 69 59 76 102 85 54 59 83
## [307] 32 68 78 67 25 2 66 13 96 26 46 44 106 98 87 101 103 39
## [325] 46 33 44 44 35 36 50 42 59 42 37 32 36 27 38 14 43 66
## [343] 34 117 18 40 33 30 35 38 47 72 77 79 45 71 35 52 52 25
## [361] 61 97 90 82 69 34 41 62 70 27 64 102 33 73 206 72 27 99
## [379] 65 39 0 38 28 46 96 68 65 32 36 39 120 63 43 14 56 40
## [397] 33 26 28 48 54 128 33 38 64 26 17 113 66 56 37 27 29 36
## [415] 34 89 50 35 60 74 45 73 141 121 34 81 84 26 61 63 57 34
## [433] 0 102 17 89 35 87 35 93 37 127 56 50 62 42 7 145 32 29
## [451] 42 37 46 48 55 112 55
Count the reads of all the splice junctions of the gene harboring the variant:
GeneSJ$rowSum_SJtotal <- rowSums(GeneSJ[,grep("chr", names(GeneSJ))])
Normalize each junction's read count by the total read count of all the splice junctions of the gene, expressed as a percentage:
GeneSJ$Normalized_CanonDE <- (GeneSJ$chrX_45062749_45063421)/GeneSJ$rowSum_SJtotal*100
GeneSJ$Normalized_DG <- (GeneSJ$chrX_45062720_45063421)/GeneSJ$rowSum_SJtotal*100
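As a worked example of this normalization, the sketch below applies the same two steps to a toy table. The junction names and counts are hypothetical; the point is only that each normalized value is the junction's share of all junction reads of the gene in that sample, in percent.

```r
# Toy junction counts for two samples (hypothetical values)
toy <- data.frame(
  chrX_100_200 = c(40, 10),
  chrX_150_200 = c(10, 0),
  chrX_300_400 = c(50, 10)
)

# Total junction reads per sample, summed over the junction columns only
sj_cols <- grep("^chr", names(toy))
toy$rowSum_SJtotal <- rowSums(toy[, sj_cols])

# Share of the junction of interest, in percent of all junction reads
toy$Normalized_alt <- toy$chrX_150_200 / toy$rowSum_SJtotal * 100
toy$Normalized_alt
```

The first sample has 10 of 100 junction reads on the junction of interest, the second 0 of 20. A sample with no junction reads at all would give `NaN` (0/0), so such samples may need to be handled separately.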
Download the normalized values for the assessed splice junctions of all the AML samples:
Mutated samples VAF:
Canonical splice junction:
Splicing alterations:
Canonical splice junction:
Splicing alteration:
Violin plots for the splice junctions interrogated (figures not shown): Donor Loss, Donor Gain.
SJCounts <- GeneSJ
Normality Test:
shapiro.test(SJCounts$Normalized_CanonDE[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_CanonDE[SJCounts$GROUP == "WT"]
## W = 0.99066, p-value = 0.005593
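With p = 0.005593 the Shapiro-Wilk test rejects normality at the usual 0.05 level, which is consistent with the report going on to use an empirical distribution instead of a normal-theory score. A minimal sketch of how the test behaves on clearly non-normal data follows; the simulated vector and the seed are arbitrary and unrelated to the BeatAML values.

```r
set.seed(42)                 # arbitrary seed so the example is repeatable
skewed <- rexp(200)          # right-skewed, clearly non-normal values

res <- shapiro.test(skewed)
res$statistic                # W statistic
res$p.value < 0.05           # TRUE here: normality is rejected
```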
Mean normalized expression of the canonical SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_CanonDE[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 2.683123
Normalized expression of the canonical SJ in the MUT samples:
MUT_SJi <- SJCounts$Normalized_CanonDE[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 2.614379 3.139013
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_CanonDE - mean_WT_SJi
Difference in the MUT samples: deviation of each MUT sample's normalized expression from the mean normalized WT expression.
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] -0.06874374 0.45589063
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:448] = -2.6831, -2.2899, -2.0949, ..., 1.9441, 2.1063
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.4461538 0.7208791
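`ecdf()` returns a function that, for any value x, gives the fraction of WT deviations less than or equal to x. The two values above therefore place the MUT samples at roughly the 45th and 72nd percentiles of the WT distribution. A small sketch with made-up deviations shows the mechanics:

```r
# Hypothetical WT deviations from the WT mean
wt_diff <- c(-2, -1, 0, 1, 2)
wt_ecdf <- ecdf(wt_diff)

# Fraction of WT values <= each query value
wt_ecdf(c(-1.5, 1))
# -1.5: only -2 is <= -1.5, so 1/5 = 0.2
#  1  : -2, -1, 0 and 1 are <= 1, so 4/5 = 0.8
```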
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_CanonDE")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
# Lower tail: a donor loss is expected to reduce canonical junction usage
MUT_df$Pvalue <- MUT_df$ECDF
MUT_df$Prediction <- "Donor Loss"
MUT_df$splice_junction_status <- "CanonicalSJ"
MUT_df$splice_junction_position <- "chrX:45062749-45063421"
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
MUT_df$Pvalue <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$Pvalue)
Download the VAF, inferred percentiles and p-values of the mutated samples:
Normality Test:
shapiro.test(SJCounts$Normalized_DG[SJCounts$GROUP == "WT"])
##
## Shapiro-Wilk normality test
##
## data: SJCounts$Normalized_DG[SJCounts$GROUP == "WT"]
## W = 0.73658, p-value < 2.2e-16
Mean normalized expression of the alternative SJ in WT samples:
mean_WT_SJi <- mean(SJCounts$Normalized_DG[SJCounts$GROUP == "WT"], na.rm=TRUE)
mean_WT_SJi
## [1] 0.05049228
Normalized expression of the alternative SJ in the MUT samples:
MUT_SJi <- SJCounts$Normalized_DG[SJCounts$GROUP == "MUT"]
MUT_SJi
## [1] 0.1307190 0.1494768
Deviation from the mean normalized expression:
SJCounts$Difference <- SJCounts$Normalized_DG - mean_WT_SJi
Difference in the MUT samples: deviation of each MUT sample's normalized expression from the mean normalized WT expression.
SJCounts$Difference[SJCounts$GROUP == "MUT"]
## [1] 0.08022668 0.09898455
v_ecdf <- ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
v_ecdf
## Empirical CDF
## Call: ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"])
## x[1:217] = -0.050492, -0.034438, -0.032458, ..., 0.3371, 0.35437
plot(ecdf(SJCounts$Difference[SJCounts$GROUP == "WT"]))
v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
## [1] 0.8791209 0.9142857
MUT_df <- SJCounts[SJCounts$GROUP == "MUT",c("sample_id","case_id", "Normalized_DG")]
colnames(MUT_df) <- c("sample_id", "case_id", "NormalizedExpression")
MUT_df$ECDF <- v_ecdf(SJCounts$Difference[SJCounts$GROUP == "MUT"])
# Upper tail: a donor gain is expected to increase alternative junction usage
MUT_df$Pvalue <- 1 - MUT_df$ECDF
MUT_df$Prediction <- "Donor Gain"
MUT_df$splice_junction_status <- "AlternativeSJ found in MUT samples"
MUT_df$splice_junction_position <- "chrX:45062720-45063421"
MUT_df$ECDF <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$ECDF)
MUT_df$Pvalue <- ifelse(MUT_df$NormalizedExpression == 0, NA,MUT_df$Pvalue)
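The same steps (centre on the WT mean, build the WT ECDF, evaluate the MUT samples, pick a tail) are repeated for every junction. One way to avoid the repetition is a small helper such as the one sketched below. `empirical_p()` is a hypothetical name that does not appear in the report, and the input values are invented; it assumes the lower tail is used for a loss and the upper tail for a gain.

```r
# Hypothetical helper wrapping the repeated ECDF steps
empirical_p <- function(wt_values, mut_values, direction = c("loss", "gain")) {
  direction <- match.arg(direction)
  wt_values <- wt_values[!is.na(wt_values)]
  centre <- mean(wt_values)

  # ECDF of WT deviations from the WT mean
  wt_ecdf <- ecdf(wt_values - centre)
  percentile <- wt_ecdf(mut_values - centre)

  # Lower tail for a loss, upper tail for a gain
  p <- if (direction == "loss") percentile else 1 - percentile

  # No estimate when the junction has zero reads in the sample
  zero <- which(mut_values == 0)
  percentile[zero] <- NA
  p[zero] <- NA

  data.frame(NormalizedExpression = mut_values, ECDF = percentile, Pvalue = p)
}

# Made-up normalized values: 8 WT samples, 2 MUT samples
wt <- c(0, 0, 0.02, 0.05, 0.10, 0.20, 0.30, 0.40)
res <- empirical_p(wt, c(0.25, 0), direction = "gain")
res
```

In this toy call, six of the eight WT values are at or below 0.25, so the first MUT sample gets an ECDF of 0.75 and an upper-tail p-value of 0.25, while the zero-read sample is set to `NA`.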
Download the VAF, inferred percentiles and p-values of the mutated samples:
Download the VAF, inferred percentiles and p-values of all the splicing alterations evaluated:
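Because `MUT_df` is rebuilt for each junction, a combined table needs the per-junction results to be kept and stacked. The sketch below shows one way this could be done; the object names `loss_df`, `gain_df` and `all_df`, the values and the output path are all hypothetical.

```r
# Hypothetical per-junction results with identical columns
loss_df <- data.frame(sample_id = "S1", NormalizedExpression = 2.6,
                      Pvalue = 0.45, Prediction = "Donor Loss")
gain_df <- data.frame(sample_id = "S1", NormalizedExpression = 0.13,
                      Pvalue = 0.12, Prediction = "Donor Gain")

# Stack the results; rbind() requires matching column names
all_df <- rbind(loss_df, gain_df)

# Write a tab-separated file (a temporary path is used here as a placeholder)
out_path <- tempfile(fileext = ".tsv")
write.table(all_df, file = out_path, sep = "\t",
            quote = FALSE, row.names = FALSE)
```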